Methods for identifying and modulating co-occurant cellular phenotypes

ABSTRACT

The present invention provides tools and methods for the systematic analysis of genetic interactions between cells. The present invention provides tools and methods for modulating cell phenotypes and compositions, combinatorial probing of cellular circuits, for dissecting cellular circuitry, for delineating molecular pathways, and/or for identifying relevant targets for therapeutics development.

INCORPORATION BY REFERENCE

This application is a Divisional Application of U.S. Ser. No. 16/085,938, filed Sep. 17, 2018, which is a national stage application of PCT/US2017/023054, filed Mar. 17, 2017, which claims priority and benefit of U.S. provisional application Ser. No. 62/309,680 filed Mar. 17, 2016, all of which are incorporated herein by reference.

All documents cited or referenced in the application cited documents, and all documents cited or referenced herein (“herein cited documents”), and all documents cited or referenced in herein cited documents, together with any manufacturer's instructions, descriptions, product specifications, and product sheets for any products mentioned herein or in any document incorporated by reference herein, are hereby incorporated herein by reference, and may be employed in the practice of the invention. More specifically, all referenced documents are incorporated by reference to the same extent as if each individual document was specifically and individually indicated to be incorporated by reference.

FIELD OF THE INVENTION

The present invention relates to molecular profiling at the single-cell level as well as at the level of cell populations. The present application provides a spatially-resolved single-cell RNA sequencing (scRNA-Seq) approach to link neighboring cell states/phenotypes and use it to propose extracellular regulators (microenvironmental factors, cell-cell interactions) of cellular behaviors, which can be validated and screened for clinical relevance. The present invention also enables a systematic, “on-the-fly” technique for spatially patterning and/or barcoding cells on a surface (e.g. inorganic, organic, or biological) in a user-desired arrangement with the ability to later release target cells via enzymatic, chemical, and/or photo-cleavable methods. Additionally, the present invention is relevant for therapeutics target discovery.

BACKGROUND OF THE INVENTION

While the cell is the least common denominator of life, a multicellular organism would not function with its constituent cells acting in isolation. Intercellular communication, both at a distance and via direct contact, is crucial to performing all of life's functions, from neurons telling muscle cells to contract and embryonic stem cells differentiating to immune cells coordinating systems-level defenses. Nevertheless, the impact of cell-to-cell communication on cellular phenotypes remains an understudied, and poorly understood, area of biology. The following examples and descriptions are placed in the context of the immune system and related examples, but are not intended to be limited to these systems. Rather they are meant to encompass any and all cells and cellular interactions.

Immune cells are the primary defenders of our bodies against illness. Precision and accuracy in their actions are crucial for our health: failure to sense and respond can make us susceptible to illness, while inappropriate sensing can lead to autoimmune disease. Understanding the molecular circuits that control immune cell behaviors is a fundamental biological goal offering untold clinical possibilities.

Cellular heterogeneity is a hallmark of the immune system and is essential for protecting us against myriad, evolving pathogenic threats. Thus, immune isolates from clinical samples, such as biopsies, blood, and synovia, all consist of complex cell mixtures. To date, genomic analyses of clinical samples have relied on either profiling this heterogeneous mixture or first sorting sub-populations and then profiling them. The former strategy only provides an average, teaching us more about the component cell types than their states; the latter is limited to known sub-populations and sorting panels, can be difficult to implement for small samples (<1 million cells), and masks any variation within the sub-population, which can be substantial. Indeed, recent approaches for deciphering complex cell circuits combine genomic profiling to measure a circuit's components or interactions, computational algorithms to infer a model from those profiles and perturbation techniques (e.g., RNAi) to test and refine it. However, there are two major challenges in applying this strategy to primary immune cells. First, naïve immune cells are notoriously difficult to perturb with traditional transfection methods. Second, a typical immune-cell population contains many different cell subtypes and states and ensemble-based profiling methods cannot accurately measure the population's constituents, much less how their coordinated behaviors determine systems-level responses. These two hurdles have severely limited understanding of the circuits that dictate immune-cell development, behavior, response and function, in both healthy and diseased states.

Citation or identification of any document in this application is not an admission that such document is available as prior art to the present invention.

SUMMARY OF THE INVENTION

The invention comprehends a method to develop and leverage a novel platform for spatially-resolved single cell genomic profiling of tissues to identify novel intercellular communication genes and gene-products, more specifically immune evasion genes or gene-products from intratumoral heterogeneity. This method provides an integrated, single platform to measure changes in gene expression directly from single cells. The connection between intratumoral heterogeneity and local immune cell phenotypes is lost when averaging across a population (e.g., a large piece of tumor) or disaggregating tissues for single-cell RNA-Seq (scRNA-Seq). The present application provides a spatially-resolved scRNA-Seq approach (or general method for spatially resolved cellular profiling) to link neighboring cell states/phenotypes and use it to propose regulators of immune cell suppression, which can be validated and screened for clinical relevance.

The ability to evade immune attack is one of the cardinal features of cancer (Hanahan, D. & Weinberg, R. A. “Hallmarks of Cancer: The Next Generation” Cell 144, 646-674, doi:10.1016/j.cell.2011.02.013 (2011)), but it is a trait that is heterogeneous across a population of tumor cells. Single-cell profiling of tumors has begun to reveal the existence of multiple sub-states of tumor cells (Patel, A. P. et al. “Single-cell RNA-seq highlights intratumoral heterogeneity in primary glioblastoma” Science 344, 1396-1401, doi:10.1126/science.1254257 (2014)), but two crucial gaps remain in the understanding of intratumoral heterogeneity. First, the repertoire of immune-evasion mechanisms utilized by tumor cells is not known, nor is how it differs across distinct tumor regions. Second, it is not known whether heterogeneity in the expression of immune evasion molecules by individual tumor cells impacts the state of tumor infiltrating lymphocytes (TIL) locally or globally.

Specifically, the invention provides a method to develop and leverage an innovative platform for spatially-resolved, single-cell genomic profiling of tissues to identify novel immune evasion genes from intratumoral heterogeneity. Applicants have developed a robust experimental platform for spatially tagging cells in tumor sections with unique sets of oligonucleotides prior to scRNA-Seq. By linking neighboring tumor and immune cell behaviors, Applicants identify TIL states and their correlation with local tumor cell gene expression. Using this workflow, Applicants are able to identify candidate tumor cell genes that alter the number and differentiation state of local immune infiltrates.

The present application also provides a method to perturb candidate immune evasion molecules to validate their effects on tumor immunity. Applicants use over-expression (lentivirus or CRISPRa) and knockout (CRISPRi), or other biological or chemical perturbation methods known to the art, to test and validate the putative immune evasion genes identified in the foregoing. Applicants use pooled screens to monitor for the selective survival of modified tumor cells in the presence of immune pressure, assessing how overexpression/knockout of immune evasion genes causes selective accumulation/loss of genetically modified tumor cells relative to wild-type counterparts. For genes that score in these screens, Applicants thoroughly characterize their roles in tumor immunity by microscopy, flow cytometry, and scRNA-Seq. In parallel, Applicants also perturb and characterize predicted regulators in immune cells.
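By way of non-limiting illustration, selective accumulation or loss in a pooled screen of this kind might be scored as a log2 fold change in each perturbation's share of the pool, expressed relative to the wild-type control. The following Python sketch assumes read or cell counts per perturbation are available before and after immune pressure; the function names, the `wild_type` label, and the pseudocount are hypothetical and are not taken from the present disclosure.

```python
import math


def log2_fold_changes(before, after, pseudocount=1.0):
    """Return log2(after share / before share) for every perturbation.

    `before` and `after` map a perturbation name to its count in the pool.
    The pseudocount (an assumption of this sketch) avoids log2(0) when a
    perturbation drops out entirely.
    """
    n = len(before)
    total_before = sum(before.values()) + pseudocount * n
    total_after = sum(after.get(name, 0) for name in before) + pseudocount * n
    changes = {}
    for name, count_before in before.items():
        share_before = (count_before + pseudocount) / total_before
        share_after = (after.get(name, 0) + pseudocount) / total_after
        changes[name] = math.log2(share_after / share_before)
    return changes


def relative_to_control(changes, control="wild_type"):
    """Subtract the control's log2 fold change so the control scores 0."""
    baseline = changes[control]
    return {name: value - baseline for name, value in changes.items()}


if __name__ == "__main__":
    # Hypothetical counts: one knockout is depleted, one over-expression enriched.
    before = {"wild_type": 1000, "geneA_KO": 1000, "geneB_OE": 1000}
    after = {"wild_type": 1000, "geneA_KO": 250, "geneB_OE": 4000}
    scores = relative_to_control(log2_fold_changes(before, after))
    for name, score in scores.items():
        print(f"{name}: {score:+.2f}")
```

Under these assumptions, a negative score indicates selective loss of the modified cells relative to wild-type and a positive score indicates selective accumulation; any threshold for calling a gene a hit would have to be chosen separately.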

The present application also provides tools to spatially pattern cells on a surface (e.g. inorganic, organic, or biological) in a user-directed fashion. Applicants and others can use the molecular platform described herein to control the arrangement of cells and control release via enzymatic, chemical, and/or photo-cleavable means for downstream genomic profiling.

Applicants also developed a method to map intratumoral heterogeneity of cancer and immune cells in clinical melanoma samples. Direct interactions between TILs and malignant (e.g., melanoma) cells, TILs and stromal cells, or malignant and non-malignant cells may have major implications for disease progression and treatment strategies in the clinic.

The immune system plays a crucial role in fighting cancer. The large number of genetic alterations inherent to most cancer cells provides myriad tumor-associated neo-antigens that the host immune system can recognize (Yadav, M. et al. Predicting immunogenic tumour mutations by combining mass spectrometry and exome sequencing. Nature 515, 572-576, doi:10.1038/nature14001 (2015)). Nevertheless, the coexpression of surface and secreted molecules by tumor cells has been shown to confer on tumors the ability to evade immune responses (Fridman, W. H., et al. The immune contexture in human tumours: impact on clinical outcome. Nature Reviews Cancer 12, 298-306, doi:10.1038/nrc3245 (2012)). Several recent trials (Victor, C. T.-S. et al. Radiation and dual checkpoint blockade activate non-redundant immune mechanisms in cancer. Nature, 1-18, doi:10.1038/nature14292 (2015); Ott, P. A., et al. CTLA-4 and PD-1/PD-L1 Blockade: New Immunotherapeutic Modalities with Durable Clinical Benefit in Melanoma Patients. Clin. Cancer Res. 19, 5300-5309, doi:10.1158/1078-0432.CCR-13-0143 (2013); Tumeh, P. C. et al. PD-1 blockade induces responses by inhibiting adaptive immune resistance. Nature 515, 568-571, doi:10.1038/nature13954 (2014)) have demonstrated that blocking immune checkpoint receptors (normally engaged by tumor cells to suppress immune responses) can generate striking clinical responses against a range of tumors. However, the clinical impact of tumor immunity in patients with cancer is variable and many patients fail to respond to immunotherapy (Victor et al. 2015; Ott et al. 2013; Tumeh et al. 2014). Overall, the cellular and molecular mechanisms that result in response or resistance to immunotherapy are poorly understood.

Cancer cells, even from the same tumor, are highly heterogeneous (due to somatic evolution, genomic instability, and differences in epigenetic state (Patel et al. Science 2014; Driessens, G., Beck, B., Caauwe, A., Simons, B. D. & Blanpain, C. Defining the mode of tumour growth by clonal analysis. Nature 488, 527-530, doi:10.1038/nature11344 (2013); Gerlinger, M. et al. Intratumor Heterogeneity and Branched Evolution Revealed by Multiregion Sequencing. The New England journal of medicine 366, 883-892, doi:10.1056/NEJMoa1113205 (2012); Navin, N. et al. Tumour evolution inferred by single-cell sequencing. Nature 472, 90-94, doi:10.1038/nature09807 (2012); Schepers, A. G. et al. Lineage Tracing Reveals Lgr5+ Stem Cell Activity in Mouse Intestinal Adenomas. Science (New York, N.Y.) 337, 730-735, doi:10.1126/science.1224676 (2012); Yachida, S. et al. Distant metastasis occurs late during the genetic evolution of pancreatic cancer. Nature 467, 1114-1117, doi:10.1038/nature09515 (2010); Eppert, K. et al. Stem cell gene expression programs influence clinical outcome in human leukemia. Nature Medicine 17, 1086-1093, doi:10.1038/nm.2415 (2011))) and this variability has been shown to underlie drug resistance and tumor relapse (Bedard, P. L., Hansen, A. R., Ratain, M. J. & Siu, L. L. Tumour heterogeneity in the clinic. Nature 501, 355-364, doi:10.1038/nature12627 (2013); Marusyk, A., Almendro, V. & Polyak, K. Intra-tumour heterogeneity: a looking glass for cancer? Nature Reviews Cancer 12, 323-334, doi:10.1038/nrc3261 (2012)). Increasing evidence suggests that there is also intratumoral heterogeneity in the ability of tumor cells to evade immunity. First, infiltrating immune cells are often not evenly dispersed throughout the tumor, but instead are clustered (Azimi, F. et al. Tumor-infiltrating lymphocyte grade is an independent predictor of sentinel lymph node status and survival in patients with cutaneous melanoma. J Clin Oncol 30, 2678-2683, doi:10.1200/JCO.2011.37.8539 (2012)). Second, the expression of immune evasion genes, such as PD-L1, is not monomorphic throughout a tumor, but rather is restricted to a sub-population of cells (Tumeh et al. 2014; Topalian, S. L. et al. Safety, Activity, and Immune Correlates of Anti-PD-1 Antibody in Cancer. The New England Journal of Medicine 366, 2443-2454, doi:10.1056/NEJMoa1200690 (2012)). These data suggest that there is geographic variation in the tumor gene expression and mutation burden that drives immune activity and the interactions between tumor and immune cells. However, the basis for these regional enrichments or paucities is not known, nor is their significance for tumor biology or clinical course.

The following further underscores some of the intricacies involved when studying health and disease, including host-pathogen interactions, as well as host-specific diversity associated with particular cell types, or cell type subpopulations. It is thus clear that there is a need in the art to further unravel complex immune system heterogeneity and synergies, in particular to establish cellular networks and cell interactions, aiming at improving diagnostic or therapeutic efforts.

The invention comprehends, and the invention provides each aspect as discussed herein below:

The invention comprehends providing a cell functionalizing probe comprising a polyadorned molecule, wherein the molecule is adorned with anywhere from 2 to 5 groups: optionally a label attached to the polyadorned molecule; a cell-surface reactive group attached to the polyadorned molecule, wherein the reactive group is selectively activated by light or any other method known in the art; a bio-orthogonal reactive group attached to the polyadorned molecule; optionally a second reactive group attached to the polyadorned molecule; and, optionally a group to improve water solubility. In an embodiment of the invention, the polyadorned molecule of the cell functionalizing probe is a single aromatic molecule, e.g., benzene, dihydroxyaryl or triazine. In another embodiment of the invention, the polyadorned molecule of the cell functionalizing probe comprises a functional moiety, e.g., NHCO, NHCCH₂, NHCN, or NHCS. In another embodiment, the label of the cell-surface functionalizing probe is a fluorophore, a peptide based-tag, biotin, affinity reagent, hapten, lanthanide heavy metal (or lanthanide heavy metals or combination thereof) or an oligonucleotide. In an embodiment of the invention, the bio-orthogonal reactive group of the cell functionalizing probe is an alkyne, strained alkyne, alkene, or strained alkene. The present application contemplates cell functionalizing probes which can be activated via copper chemistry, copper-free chemistry, photoclick chemistry, or synthesized by inverse-demand Diels Alder.

By adornment is meant addition to or presence of a functional moiety, or substituent, linked to a molecular core. In certain non-limiting embodiments, the molecular core is aromatic, for example, without limitation, benzene, dihydroxyaryl, and triazine. A cell functionalizing probe of the invention is said to be “polyadorned” in that the probe comprises a molecular core substituted with 2 or more functional substituents as set forth herein. In an embodiment of the invention, the two or more substituents comprise a cell-surface reactive group, for instance in a non-limiting example a cell-surface reactive group that is selectively activated by light, and a bio-orthogonal substituent, for instance in a non-limiting example a bio-orthogonal substituent capable of being linked by click chemistry to a moiety comprising a polynucleotide barcoded tag. Bio-orthogonal is used in its usual sense and refers to a substituent functional in a living system that does not interfere substantially with native biochemical processes.

In an embodiment, the cell functionalizing probe comprises a reactive group, wherein the reactive group is a photoactivated cell-surface reactive group. In another embodiment, the photoactivated cell-surface reactive group of the cell functionalizing probe is a benzophenone, azide, or diazirine, wherein the group is activated to become a carbon-centered radical, nitrene, or carbene, respectively.

In an aspect of the invention, the invention provides a cell functionalizing barcoded tag comprising a polyadorned molecule, wherein the molecule is adorned with anywhere from 2 to 5 groups; a fluorophore, peptide-based tag, biotin, hapten, affinity reagent, lanthanide heavy metal(s) or combination thereof, or oligonucleotide label attached to the polyadorned molecule; a paired bio-orthogonal reactive group to the bio-orthogonal group of the cell functionalizing probe (e.g., azide, nitrone, tetrazine, or tetrazole attached to the polyadorned molecule); and, an oligonucleotide, fluorophore, peptide, affinity reagent, biotin or other specific barcode comprising a spatial barcode (in an embodiment this consists of a scRNA-seq compatible handle), wherein the barcode is attached to the polyadorned molecule. In another embodiment, the label of the cell functionalizing barcoded tag is a fluorophore, a peptide-based tag, biotin, or a cyanine-based dye. Examples of peptide-based tags include FLAG-tag, V5 tag, HA-tag, AviTag, Calmodulin-tag, polyglutamate tag, E-tag, His-tag, Myc-tag, S-tag, SBP-tag, Softag 1, Softag 3, Strep-tag, TC tag, VSV-tag, or Xpress tag, or any other similar tag known in the art. In another embodiment, the scRNA-seq method is smart-seq2, TruSeq, CEL-Seq, Drop-Seq, In-drop Seq, STRT, ChIRP-Seq, GRO-Seq, CLIP-Seq, Quartz-Seq, or any other similar method known in the art (see, e.g., “Sequencing Methods Review” Illumina® Technology, https://www.illumina.com/content/dam/illumina-marketing/documents/products/research_reviews/sequencing-methods-review.pdf).

In another aspect, the invention provides a cell functionalizing barcoded tag comprising 2 to 5 functional moieties. In an embodiment, the barcoded tag comprises a label selected from the group consisting of fluorophore, peptide-based tag, biotin, hapten, affinity reagent, lanthanide heavy metal(s) or combination thereof, or oligonucleotide. In another embodiment, the label of the cell functionalizing barcoded tag is a fluorophore, a peptide-based tag, biotin, or a cyanine-based dye. In another embodiment, the barcoded tag comprises a paired bio-orthogonal reactive group to the bio-orthogonal group of the cell functionalizing probe (e.g., azide, nitrone, tetrazine, or tetrazole attached to the polyadorned molecule). In another embodiment, the barcoded tag comprises an oligonucleotide, fluorophore, peptide, affinity reagent, biotin or other specific barcode comprising a space barcode or an elongated space barcode, optionally a 5′ handle, and optionally a poly A tail.
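For an oligonucleotide barcode of this kind, the layout might be sketched as follows. This Python example is illustrative only: the 5′-to-3′ ordering (handle, then space barcode, then poly A tail), the function name, and the example sequences are assumptions of the sketch and are not specified by the present disclosure.

```python
def build_barcoded_oligo(space_barcode, handle_5prime="", poly_a_length=0):
    """Assemble a hypothetical barcode oligonucleotide, written 5' to 3'.

    The space barcode is required; the 5' handle and the poly A tail are
    optional, mirroring the optional elements described in the text.
    """
    if not space_barcode:
        raise ValueError("a space barcode is required")
    if poly_a_length < 0:
        raise ValueError("poly A length cannot be negative")
    barcode = space_barcode.upper()
    handle = handle_5prime.upper()
    for part in (barcode, handle):
        if not set(part) <= set("ACGT"):
            raise ValueError(f"non-DNA character in {part!r}")
    # Assumed ordering: handle at the 5' end, poly A tail at the 3' end.
    return handle + barcode + "A" * poly_a_length


if __name__ == "__main__":
    # Placeholder sequences chosen only for illustration.
    print(build_barcoded_oligo("ACGTACGT", handle_5prime="GGC", poly_a_length=5))
```

A poly A tail at the 3′ end is chosen here because poly(dT)-primed scRNA-seq chemistries capture polyadenylated molecules; whether a given embodiment uses this arrangement is not stated above.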

In another aspect, the invention provides a cell functionalizing barcoded tag (“oligo tag”) for patterning surfaces.

In another aspect, the invention provides a continuous method of single-cell profiling in a subject in need thereof wherein the single cells are spatially resolved, the method comprising (a) saturating cells in the subject in need thereof with a cell functionalizing probe; (b) activating the cell functionalizing probe with light (e.g., UV); (c) labelling cells with a cell functionalizing barcoded tag; (d) washing the tissue with an aqueous solution, wherein the solution removes extra functionalizing probe; (e) repeating steps (a) through (d) anywhere from 1 to about 100 (and potentially more) times; (f) separating the labeled cellular ensemble (e.g., a tissue) into a suspension of single cells or small cell aggregates by any method known in the art; (g) optionally sorting and enriching cells comprising a cell functionalizing tag and cell functionalizing probe via flow cytometry or any cell separation method (e.g. magnetic isolation) known to the art; (h) profiling single cell sequences, whole cell populations, or cell subpopulations; (i) optionally assembling the single cell sequences into a visual representation, wherein the relationship between amplified shared spatial barcodes of single cell sequences is obtained by a computational method; and (j) analyzing cellular phenotypes using categorical spatial information.

In an aspect of the invention, the present application provides a continuous method of single-cell profiling in a subject in need thereof wherein the single cells are spatially resolved, the method comprising: (a) conjugating a cell functionalizing probe to a cell functionalizing barcoded tag, whereby an active complex is formed; (b) saturating tissue in the subject in need thereof with the active complex as described according to the cell functionalizing probes provided herein; (c) activating the cell functionalizing probe; (d) washing the tissue with an aqueous solution, wherein the solution removes excess active complex; (e) repeating steps (a) through (d) anywhere from 1 to 100 times; (f) separating the labeled cellular ensemble (e.g., a tissue) into a suspension of single cells or small cell aggregates; (g) optionally sorting and enriching cells comprising a cell functionalizing tag and a cell functionalizing probe via a cell separation method; (h) profiling single cell sequences; (i) optionally assembling the single cell sequences into a visual representation, wherein the relationship between amplified shared spatial barcodes of single cell sequences is obtained by a computational method; and (j) optionally using spatial information as a categorical variable for downstream computational analysis.
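Steps (i) and (j) above can be illustrated, in a non-limiting manner, by grouping profiled cells according to a shared spatial barcode and summarizing a phenotype per group. The Python sketch below assumes each profiled cell has already been assigned one spatial barcode and a per-gene expression value; the record layout, the function names, and the example gene are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def group_by_spatial_barcode(cells):
    """Group cell records by the spatial barcode recovered with each profile."""
    regions = defaultdict(list)
    for cell in cells:
        regions[cell["spatial_barcode"]].append(cell)
    return dict(regions)


def region_summary(cells, gene):
    """Treat the spatial barcode as a categorical variable and report, per
    region, the number of cells and the mean expression of `gene`.
    Genes absent from a cell's profile are counted as 0.0 (an assumption)."""
    summary = {}
    for barcode, members in group_by_spatial_barcode(cells).items():
        values = [member["expression"].get(gene, 0.0) for member in members]
        summary[barcode] = {"n_cells": len(members), "mean_expression": mean(values)}
    return summary


if __name__ == "__main__":
    # Hypothetical records: two cells share barcode BC1, one carries BC2.
    cells = [
        {"cell_id": "c1", "spatial_barcode": "BC1", "expression": {"PDCD1": 2.0}},
        {"cell_id": "c2", "spatial_barcode": "BC1", "expression": {"PDCD1": 4.0}},
        {"cell_id": "c3", "spatial_barcode": "BC2", "expression": {}},
    ]
    for barcode, stats in region_summary(cells, "PDCD1").items():
        print(barcode, stats)
```

Cells sharing a barcode are treated here as neighbors, so differences between groups can be read as differences between tagged regions. A real analysis would also need to handle cells carrying several barcodes from repeated labelling rounds, which this sketch does not address.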

In another aspect, the invention provides a method for spatially patterning specific cells on surfaces (e.g. inorganic, organic, or biological). This is currently achievable (Todhunter et al. Nature Methods 2015); however, Applicants' method has the specific advantage of enabling user-defined cellular placement that can be adjusted in real-time as opposed to pre-printing of oligonucleotides on surfaces. In an embodiment of this method, the cell functionalized barcode would be conjugated to cells using a non-specific chemistry (e.g., NHS-ester ligation or cholesterol (3′ cholesterol-TEG)) or a specific chemistry, and the surface would be patterned with the cell functionalized probe (via photoactivation or another user-controlled activation scheme). By flowing the barcoded cells over the surface, some will be attached via the click (or other) functionalization paired on both molecules. In another embodiment of this method, the surface is patterned with the cell functionalized probe labeled with an oligonucleotide; cells can then be flowed over (streamed) and conjugated non-specifically (e.g. via NHS-ester ligation) or specifically. In an aspect of the invention, whole tissues or vibratome-sliced portions of biopsied human samples or whole mouse organ can be directly applied onto pre-barcoded surfaces. In another embodiment of this method, the surface is patterned with the cell functionalized probe labeled with an oligonucleotide, and cells labeled with complementary oligonucleotides can then be adhered to the surface in a specific manner.

In another aspect, the invention provides a method of spatially patterning cells on surfaces wherein single cells are spatially localized, the method comprising: (a) assembling a cell functionalized barcode conjugated to cell(s); (b) assembling a surface, wherein the surface is patterned with a cell functionalized probe; and (c) streaming cells conjugated with a cell functionalized barcode over the surface patterned with a cell functionalized probe, whereby cells are attached to the cell functionalized probe via complementary pairing chemistry. In an embodiment, the complementary pairing chemistry is click functionalizing pairing or oligonucleotide complementarity.

In another aspect, the invention provides a method of spatially patterning cells on surfaces wherein single cells are spatially localized, the method comprising: (a) assembling a cell functionalized barcode conjugated to cell(s); (b) assembling a surface (biological, inorganic, or organic) patterned with a cell functionalized probe, wherein the surface is patterned with a cell functionalized probe labelled with an oligonucleotide; (c) streaming cells conjugated with a cell functionalized barcode over the surface patterned with a cell functionalized probe, whereby cells are conjugated non-specifically; and, (d) optionally analyzing cellular phenotypes using spatial information. In an embodiment, the complementary pairing chemistry is click functionalizing pairing or oligonucleotide complementarity.

In another aspect, the invention provides a method of spatially patterning cells on surfaces wherein single cells are spatially localized, the method comprising: (a) assembling a cell functionalized barcode conjugated to cell(s); (b) assembling a surface (biological, inorganic, or organic) with a cell functionalized probe, wherein the cell functionalized probe patterned on the surface is labelled with complementary oligonucleotides; (c) streaming cells conjugated with a cell functionalized probe over the surface, whereby cells are conjugated specifically; and, (d) optionally analyzing cellular phenotypes using spatial information. In an embodiment, the complementary pairing chemistry is click functionalizing pairing or oligonucleotide complementarity.

In another aspect, the invention provides a method for a cell functionalizing probe or a cell functionalizing barcoded tag, wherein the bio-orthogonal reactive group comprises a compound of Formula (I):

wherein R¹ is selected from the group consisting of —H, —X, —(CH₂)_(a)—NH—PG¹, —O—(CH₂CH₂O)aC CH₂)_(c)NH₂—PG¹, —O—(CH₂CH₂O)_(a)—PG²—(CH₂)_(a)—O—PG²;

-   -   R² is selected from the group consisting of CO(CH₂)_(a)NHCO(CH₂)_(a)O(CH₂CH₂O)_(c)(CH₂)_(a)NHR²¹—;
    -   R²¹ is selected from the group consisting of —H, —O(C₁-C₆ alkyl), —C₁-C₈ straight chain alkyl, —C₁-C₈ branched alkyl, —C₂-C₈ alkenyl, —C₂-C₈ alkynyl, —(C₁-C₆ alkyl)-O—(C₁-C₆ alkyl);
    -   PG¹ is an amine protecting group or —H;
    -   PG² is an alcohol protecting group or —H;
    -   X is selected from the group consisting of Cl, Br, I, F;
        -   a is independently any integer between 0 and 6;
        -   c is independently any integer between 1 and 6.

In an embodiment of the invention, PG¹ is any amine protecting group known to one of skill in the art. Mention is made of T. W. Greene & P. G. M. Wuts Protective Groups in Organic Synthesis (4th edition) J. Wiley & Sons (2006) and P. J. Kocienski Protecting Groups Georg Thieme Verlag (1994) herein incorporated by reference. Embodiments of protecting groups include, but are not limited to:

In an embodiment of the invention, PG² is any alcohol protecting group known to one of skill in the art. Mention is made of T. W. Greene & P. G. M. Wuts Protective Groups in Organic Synthesis (4th edition) J. Wiley & Sons (2006) and P. J. Kocienski Protecting Groups Georg Thieme Verlag (1994) herein incorporated by reference. Embodiments of protecting groups include, but are not limited to:

or any silyl ether thereof;

In another aspect, the invention provides a method for a cell functionalizing probe or a cell functionalizing barcoded tag wherein the fluorophore comprises a compound of Formula II:

wherein R³ is selected from the group consisting of —H, C₁-C₆ alkyl, —(CH₂)_(a)—NR⁸R⁹, —NHC(O)—Y—R⁸, —NHC(O)CHR⁸R⁹, —CHR⁸R⁹, —(CH₂)_(a)—NR⁸C(O)R⁹;

-   -   R⁴ is selected from the group consisting of —H, —OH, and —OR⁸;
    -   R⁵ and R⁶ are selected from the group consisting of —H, —OH, —X, —NO₂, —CN, —NH₂, —NHR⁸, —C(O)R⁸, —C₁-C₃ perfluoro alkyl;
    -   R⁷ is selected from the group consisting of —H, —OH, —X, —NO₂, —CN, —C(O)NH(CH₂)_(a)—O(C₁-C₆ alkyl), —NH₂, —NHR⁸, —NHC(O)—Y—R⁸, —(CH₂)_(a)—NR⁸C(O)R⁹, —NHC(O)CHR⁸R⁹, —C₁-C₃ perfluoro alkyl;
    -   R⁸ and R⁹ are independently selected from the group consisting of —H, NH₂, —(CH₂)_(a)—C(O)NH(CH₂)_(b)CH₃, —(CH₂)_(a)—C(O)NH(CH₂)_(b)C(O)NHPG³, —(CH₂)_(a)—CO₂NHPG³, —C₁-C₈ straight chain alkyl, —C₁-C₈ branched alkyl, —C₂-C₈ alkenyl, —C₂-C₈ alkynyl, —(C₁-C₆ alkyl)-O—(C₁-C₆ alkyl) each of which is optionally substituted by a halogen, ether, vinyl group, allylic group, —NH₂, or —CN, —(CH₂)_(a)NR⁸⁸R⁸⁹, —(CH₂)_(a)—C(O)NR⁸⁸R⁸⁹, an aromatic group, heteroaromatic group, C₃-C₇ cycloalkyl, a three to twelve membered heterocyclic having up to 3 heteroatoms each of which preceding cyclic group is optionally substituted from 1 to 3 substituents independently selected from a halogen, —C₁-C₆ alkyl, —C₂-C₆ alkenyl, —O(C₁-C₆ alkyl), —C(O)—, —OH, —NH₂, —CN, and —C₁-C₃ perfluoro alkyl;

R⁸⁸ and R⁸⁹ are independently selected from the group consisting of —H, —O(C₁-C₆ alkyl), —C₁-C₈ straight chain alkyl, —C₁-C₈ branched alkyl, —C₂-C₈ alkenyl, —C₂-C₈ alkynyl, —(C₁-C₆ alkyl)-O—(C₁-C₆ alkyl);

-   -   Y is selected from a covalent bond, —O—, —NH—, and —C₁-C₆ alkyl;
    -   X is selected from the group consisting of Cl, Br, I, F;
    -   PG³ is any photolabile protecting group;
        -   a is independently any integer between 0 and 6;
        -   b is independently any integer between 0 and 6.

In an embodiment of the invention, PG³ is any photolabile protecting group known to one of skill in the art. Mention is made of T. W. Greene & P. G. M. Wuts Protective Groups in Organic Synthesis (4th edition) J. Wiley & Sons (2006); P. J. Kocienski Protecting Groups Georg Thieme Verlag (1994); and C. G. Bochet “Photolabile Protecting Groups and Linkers” J. Chem. Soc., Perkin Trans. 1, 2002, 125-142; herein incorporated by reference. Embodiments of photolabile protecting groups include, but are not limited to:

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV:

wherein A′ comprises a benzophenone, an azide, or a diazirine;

-   -   B′ comprises a fluorophore, a peptide based-tag, a biotin, an affinity reagent, a hapten, one or more lanthanide heavy metal(s), or an oligonucleotide;
    -   C′ comprises an alkyne, a strained alkyne, an alkene, or a strained alkene;
    -   Z₁, Z₂, and Z₃ are each independently —CH₂—, —O—, —S—, or —N—;
    -   m is 1 or 2;
    -   n is 0, 1, or 2; and
    -   p is 1 or 2,
    -   wherein m+n+p is less than or equal to 6.
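The stated ranges for m, n, and p can be checked by simple enumeration. The following Python sketch lists every (m, n, p) combination permitted by the ranges above; the function name and the `max_total` parameter are hypothetical conveniences of the sketch.

```python
from itertools import product


def valid_substitution_counts(max_total=6):
    """Enumerate (m, n, p) with m in {1, 2}, n in {0, 1, 2}, p in {1, 2}
    and m + n + p no greater than max_total."""
    return [
        (m, n, p)
        for m, n, p in product((1, 2), (0, 1, 2), (1, 2))
        if m + n + p <= max_total
    ]


if __name__ == "__main__":
    combos = valid_substitution_counts()
    print(len(combos), "combinations; largest total:", max(sum(c) for c in combos))
```

Because the largest possible total is 2 + 2 + 2 = 6, all twelve combinations allowed by the individual ranges already satisfy the limit on m+n+p.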

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein n is 0 or 1.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein n is 1, m is 1, and p is 1.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein n is 0, and Z₁-A′ and Z₃-C′ are para to each other.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein n is 0, and Z₁-A′ and Z₃-C′ are meta to each other.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein A′ comprises a diazirine.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein A′ comprises a benzophenone.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein A′ comprises an azide.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein B′ comprises a biotin.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein B′ comprises a fluorophore.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein B′ comprises an oligonucleotide.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein B′ is a compound of Formula II.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein C′ comprises a strained alkyne.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein C′ comprises a strained alkene.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein A′ is

-   -   B′ is —H or

-   -   C′ is

-   -   L is a linker comprising (CH₂CH₂O)_(d), and d is an integer from 0 to 50;
    -   g is an integer selected from 0, 1, 2, or 3;
    -   R′ is —H, —X, CH₃, or CX₃, wherein X is —F, —Cl, —Br, or —I;
    -   R″ is aryl or C₁₋₃alkylaryl;
    -   R¹⁰ is —CO(CH₂)_(i)NHCO—, wherein i is an integer selected from 0, 1, 2, 3, or 4;
    -   R¹¹ is —H or C₁₋₃alkyl, optionally substituted with halogen;
    -   R¹² is each independently a hydrogen, alkyl, —OH, alkoxy, amino, ester, or —O-L-R¹³;
    -   R¹³ is an alkyl, hydroxyl, alkoxy, or amino;
    -   R¹⁴ is each independently —H, —OH, alkoxy, —COOH, —COC₁₋₃alkyl, —COH, amino, and L-O—R¹⁵;
    -   R¹⁵ is —H or alkyl;
    -   Z₁, Z₂, and Z₃ are each independently —CH₂—, —O—, —S—, or —N—;
    -   Q is a heteroatom, such as —NH—, —O—, or —S—;
    -   m is an integer selected from 1 or 2;
    -   n is an integer selected from 0, 1, or 2;
    -   p is an integer selected from 1 or 2,
    -   wherein m+n+p is less than or equal to 6;
    -   r is an integer selected from 0, 1, 2, or 3;
    -   u is an integer selected from 0, 1, 2, 3, or 4; and
    -   v is an integer selected from 0, 1, 2, 3, or 4.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein d is an integer from 0 to 50, preferably from 0 to 30.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein R¹⁴ comprises a hydrophilic functional group.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein d is an integer from 0 to 15, preferably from 0 to 10, more preferably from 3 to 6.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein C′ is a compound of Formula I.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula III or a compound of Formula IV, wherein Q is a heteroatom, preferably —NH—.

In another aspect, the invention provides a cell functionalizing probe comprising a compound of Formula IIIa, a compound of Formula IIIb, a compound of Formula IIIc, a compound of Formula IIId, a compound of Formula IVa, a compound of Formula IVb, a compound of Formula IVc, or a compound of Formula IVd:

wherein:

-   -   A′ is

-   -   B′ is —H or

-   -   L is a linker comprising (CH₂CH₂O)_(d), and d is an integer from 0 to 50;
    -   g is an integer selected from 0, 1, 2, or 3;
    -   R′ is —H, —X, CH₃, or CX₃, wherein X is —F, —Cl, —Br, or —I;
    -   R″ is aryl or C₁₋₃alkylaryl;
    -   R¹⁰ is —CO(CH₂)_(i)NHCO—, wherein i is an integer selected from 0, 1, 2, 3, or 4;
    -   R¹¹ is —H or C₁₋₃alkyl, optionally substituted with halogen;
    -   R¹² is each independently a hydrogen, alkyl, —OH, alkoxy, amino, ester, or —O-L-R¹³;
    -   R¹³ is an alkyl, hydroxyl, alkoxy, amino, or ester;
    -   R¹⁴ is each independently —H, —OH, alkoxy, —COOH, —COC₁₋₃alkyl, —COH, amino, and L-O—R¹⁵;
    -   R¹⁵ is —H or alkyl;
    -   Z₁, Z₂, and Z₃ are each independently —CH₂—, —O—, —S—, or —N—;
    -   Q is —NH—, —O—, or —S—;
    -   m is an integer selected from 1 or 2;
    -   n is an integer selected from 0, 1, or 2;
    -   p is an integer selected from 1 or 2,
    -   wherein m+n+p is less than or equal to 6;
    -   r is an integer selected from 0, 1, 2, or 3;
    -   u is an integer selected from 0, 1, 2, 3, or 4; and
    -   v is an integer selected from 0, 1, 2, 3, or 4.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formulae IIIa, IIIb, IIIc, IIId, IVa, IVb, IVc, or IVd, wherein d is an integer from 0 to 50, preferably from 0 to 30, more preferably from 0 to 15, from 0 to 10, or from 3 to 6.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formulae IIIa, IIIb, IIIc, IIId, IVa, IVb, IVc, or IVd, wherein n is 0 or 1.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formulae IIIa, IIIb, IIIc, IIId, IVa, IVb, IVc, or IVd, wherein Q is a heteroatom, preferably —NH—.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formulae IIIa, IIIb, IIIc, IIId, IVa, IVb, IVc, or IVd, wherein R¹⁴ comprises a hydrophilic functional group, e.g., —OH, PEG, —CO—, —NH—.

In an embodiment, the invention provides a cell functionalizing probe comprising a compound of Formulae IIIa, IIIb, IIIc, IIId, IVa, IVb, IVc, or IVd, wherein R′ is —CF₃.

In another aspect, the invention provides a cell functionalizing probe that is

In another aspect, the invention provides a cell functionalizing probe that is

As used herein, the term “alkyl” is meant to refer to a saturated hydrocarbon group which is straight-chained or branched. Example alkyl groups include methyl (Me), ethyl (Et), propyl (e.g., n-propyl and isopropyl), butyl (e.g., n-butyl, isobutyl, t-butyl), pentyl (e.g., n-pentyl, isopentyl, neopentyl), and the like. An alkyl group can contain from 1 to about 20, from 2 to about 20, from 1 to about 10, from 1 to about 8, from 1 to about 6, from 1 to about 4, or from 1 to about 3 carbon atoms.

As used herein, “alkene” or “alkenyl” refers to an alkyl group having one or more carbon-carbon double bonds.

As used herein, “strained alkene” refers to a ring structure having one or more carbon-carbon double bonds.

As used herein, “alkyne” or “alkynyl” refers to an alkyl group having one or more carbon-carbon triple bonds.

As used herein, “strained alkyne” refers to a ring structure having one or more carbon-carbon triple bonds.

As used herein, “aryl” refers to monocyclic or polycyclic (e.g., having 2, 3 or 4 fused rings) aromatic hydrocarbons such as, for example, phenyl, naphthyl, anthracenyl, phenanthrenyl, indanyl, indenyl, and the like. In some embodiments, aryl groups have from 6 to about 20 carbon atoms.

As used herein, “halo” or “halogen” includes fluoro, chloro, bromo, and iodo.

As used herein, “alkoxy” refers to an —O-alkyl group. Example alkoxy groups include methoxy, ethoxy, propoxy (e.g., n-propoxy and isopropoxy), t-butoxy, and the like.

As used herein, “aralkyl” refers to an alkyl group substituted by an aryl group.

As used herein, a bond substitution coming out of a ring, e.g.,

means that the substitution can be at any of the available positions on the ring.

The cell functionalizing probe or cell functionalizing barcoded tag of the invention can be asymmetric (e.g., having one or more stereocenters). The description of a probe or tag without specifying its stereochemistry is intended to capture mixtures of stereoisomers as well as each of the individual stereoisomers encompassed within the genus.

As used herein, “affinity reagent” is an antibody, peptide, nucleic acid, or other small molecule that specifically binds to a larger target molecule in order to identify, track, capture, or influence its activity.

As used herein, “space barcode” and “spatial barcode” are used interchangeably.

As used herein, “cell functionalizing barcoded tag”, “oligo tag”, and “barcoded tag” are used interchangeably.

It is an object of the invention not to encompass within the invention any previously known product, process of making the product, or method of using the product such that Applicants reserve the right and hereby disclose a disclaimer of any previously known product, process, or method. It is further noted that the invention does not intend to encompass within the scope of the invention any product, process, or making of the product or method of using the product, which does not meet the written description and enablement requirements of the USPTO (35 U.S.C. §112, first paragraph) or the EPO (Article 83 of the EPC), such that Applicants reserve the right and hereby disclose a disclaimer of any previously described product, process of making the product, or method of using the product. It may be advantageous in the practice of the invention to be in compliance with Art. 53(c) EPC and Rule 28(b) and (c) EPC. Nothing herein is to be construed as a promise.

It is noted that in this disclosure and particularly in the claims and/or paragraphs, terms such as “comprises”, “comprised”, “comprising” and the like can have the meaning attributed to them in U.S. Patent law; e.g., they can mean “includes”, “included”, “including”, and the like; and that terms such as “consisting essentially of” and “consists essentially of” have the meaning ascribed to them in U.S. Patent law, e.g., they allow for elements not explicitly recited, but exclude elements that are found in the prior art or that affect a basic or novel characteristic of the invention.

These and other embodiments are disclosed or are obvious from, and encompassed by, the following Detailed Description.

BRIEF DESCRIPTION OF THE DRAWINGS

The following detailed description, given by way of example, but not intended to limit the invention solely to the specific embodiments described, may best be understood in conjunction with the accompanying drawings.

FIG. 1 illustrates a method to optically tag cells within microscopic sub-regions of a cellular ensemble prior to scRNA-Seq.

FIG. 2 illustrates a method to optically tag cells within a region of interest in order to spatially barcode live tissue samples. Once the cells have been tagged, the sample is cellularized and sorted to enrich for fluorescent tags, single-cell sequences are profiled, and microenvironmental neighbors are reconstructed by a computational method from the amplification of shared spatial barcodes, which are unique to each region of interest.

FIG. 3 illustrates a scheme for the synthesis of a photocaged fluorescein molecule.

FIG. 4 illustrates a scheme for the synthesis of the two fragments, an iodobenzene and ethynylbenzene, towards the synthesis of a strained alkene.

FIG. 5 illustrates the coupling steps with both fragments to afford a strained alkene.

FIG. 6 illustrates the scheme for the synthesis of the cell functionalizing tissue probe.

FIG. 7 illustrates a scheme for the synthesis of a thienoimidazolone (SC-5) and of SC-3, each a component/fragment of the target molecule. The SC-3 reaction is initiated with SC-1, PPh₃, and I₂ to form intermediate SC-2, which is further reacted with tetraethylene glycol and NaH to form component/fragment SC-3. SC-4 is reacted to form component/fragment SC-5.

FIG. 8 illustrates a scheme for the synthesis of tetraethylene glycol derivatives of a thienoimidazolone, SC-7 and SC-8, as well as the synthesis of SC-6, a component/fragment of the target molecule. SC-6 is synthesized from the reaction of SC-3, PPh₃, and I₂. The synthesis of SC-8 is initiated with SC-5, MsCl, and NEt₃, followed by NaI and acetone, then tetraethylene glycol and NaH, to form product SC-7. SC-7 is then reacted with MsCl and NEt₃, followed by NaI and acetone, to arrive at SC-8.

FIG. 9 illustrates a scheme for the synthesis of “SpaceCat” with a triazene core (SC-12 and SC-13). The assembly of SpaceCat begins with the reaction of the component/fragment SC-3, NaH, and the triazene core to form SC-12. The partially assembled “SpaceCat” SC-12 is reacted with NaH and the component/fragment SC-7 to form SC-13.

FIG. 10 illustrates the derivatization of the triazene core with a cyclooctyne, a thienoimidazolone, and a diazirine (SC-14).

FIG. 11 illustrates a scheme for the synthesis of “SpaceCat” with an aromatic core (SC-11). First, 3,5-dihydroxybenzaldehyde is reacted with K₂CO₃ and SC-6 to form SC-9. Partially assembled “SpaceCat” SC-10 is synthesized from the reaction of SC-9 with K₂CO₃ and SC-8. The final target compound with an aromatic core (SC-11) is synthesized from the reaction of SC-10, cyclooctyne.NH₂, NaBH(OAc)₃ and NaBH₃CN.

FIG. 12 illustrates the final target compound with an aromatic core (SC-11).

FIG. 13A is a schematic detailing the motivations behind designing “SpaceCat”. Left: illustration of classic single cell RNA-Sequencing, with large tissue structures combined during tissue dissociation, with loss of locational information. Center: illustration of pilot examples, with fine dissection tools to study regionality within the “macro-environment”. Right: illustration of the ideal data structure with our protocol, enabling high resolution structural information to be retained through tissue dissociation and subsequent single-cell RNA-Sequencing.

FIG. 13B shows pilot data on the “macro-environment (~10^4)”. As a proof of concept, a large (~2.5 mm^3) MC38 tumor from a mouse model was dissected into 3 isolates based on location (annotated in each plot). T cells, macrophages, and tumor cells were FACS sorted and single-cell RNA-Sequencing was performed. The data are illustrated using a projection onto the top principal components, showing regional differences between the sections.

FIG. 14 illustrates an atlas of cellular phenotypes in SHIV from a non-human primate across a subset of different tissue types.

FIG. 15 illustrates representative data from a non-human primate necropsy study. A t-distributed stochastic neighbor embedding (t-SNE) plot of the highly variant genes is presented here, to illustrate non-equivalence of the distinct tissues. Even functionally and anatomically similar tissues, such as the Iliac lymph node and the Submandibular lymph node, are composed of differing frequencies of cell types and thus show differing projections into t-SNE space. As illustrated in the lower panels, three secondary lymphoid tissues exhibit large variation in the frequency of T cells (as determined by CD3 delta chain), where darker shades indicate greater levels of expression.

FIG. 16 demonstrates the principle that unique cellular phenotypes emerge based on tissue compartment (and therefore, local regional effects). On the left panel, a t-SNE plot illustrates that the dominant factor in cell-cell variability is the cell type (e.g., neutrophils cluster with neutrophils; lymphocytes cluster separately). On the right panel, a principal components analysis plot shows that indeed the tissue compartments are a major source of variability between cells of the “same type” (here: activated neutrophils).

FIG. 17 shows two heatmaps detailing the origin of the variability between blood-derived neutrophils and sputum-derived neutrophils.

FIG. 18A illustrates the different cell types present in the MC38 tumor. T cells, macrophages, and tumor cells were FACS sorted.

FIG. 18B is a photograph of the dissected tumor, with masks to define regions.

FIG. 18C shows PCA plots as in FIG. 18A, divided by the region of origin.

FIG. 19 shows heatmaps describing the regional differences identified in like cell types (between macrophages, between tumor cells) based on location in the original solid structure. On the left, genes that are significantly differentially expressed in the macrophages for each section are plotted (e.g. high in Section 1, high in Section 2, high in Section 3), and the common pathways that those genes represent are detailed in the margins. The same analysis is completed for the tumor cells on the right.

FIG. 20 illustrates that T cells cluster by exhaustion markers. Single cells are scored by their expression of canonical markers of exhaustion, and subsequently clustered by similarity. As illustrated, certain regions of the tumor contain cells with exhaustion phenotypes that are similar to each other (within a microenvironment) yet are distinct from other cells in distant regions.

FIG. 21A illustrates that cells in the center of a tumor structure exhibit a strong signature for hypoxia. FIG. 21B shows that the T cells that segregate between different tumor regions express different interferon signaling pathway components, and at different magnitudes. This indicates that immunity is regionally confined and reacts differently depending on the local influences.

FIG. 22 is a schematic of the computational analysis required for spatial tags on single cells. Here, single cells are represented in each row of the graph, and the height (blue) of each peak represents the sum of reads that align to certain locations on the gene (such that a highly expressed gene will have many reads that “pile up” over its exons, outlined along the bottom row, e.g. “Gene A”). On the right, Applicants illustrate how DNA barcodes will accompany single cells based on their location in space. This will individually assign single cells to a common location in the native tissue, as read out from sequencing data.

FIG. 23 presents alternative, parallel and supplementary schemes for identifying single cells by their spatial configuration in a native tissue. On the left, Applicants illustrate the mechanism of photo-uncaging, wherein a certain wavelength of light mediates uncaging, and therefore fluorescence, of a molecule that can be identified, and cells can be sorted based on this tag. On the right, Applicants illustrate a method wherein specific DNA barcodes are printed onto a surface, such that they can react with moieties on the surface of functionalized single cells, tagging the single cell location in space.

FIG. 24 illustrates the use of NVOC-caged calcein dye as a light-dependent uncaging system to identify cells based on their location. Two fields of view (FOV) are shown, in which a set of cells is identified as viable by DAPI staining in the “pre-activation” image, and then are subsequently exposed to light that enables calcein to be un-caged and fluoresce inside of viable cells. The “post activation” images are taken shortly thereafter, illustrating that the center of the FOV, where uncaging wavelengths have been directed, shows new fluorescence in the calcein channel. Bleaching in the DAPI channel in the post-activation images is also observed.

FIG. 25 illustrates different examples of a cell functionalizing barcoded tag (“oligo” or oligonucleotide tag).

FIG. 26 illustrates examples of oligo constructs for patterning surfaces.

DETAILED DESCRIPTION OF THE INVENTION

The invention comprehends a method to develop and leverage a novel platform for spatially-resolved single cell genomic profiling of tissues to identify novel immune evasion genes from intratumoral heterogeneity. This method provides an integrated, single platform to measure changes in gene expression directly from single cells. The connection between cells and their neighbors is lost when averaging across a population (e.g., a large surface of cells) or disaggregating tissues for scRNA-Seq. The present application provides a spatially-resolved scRNA-Seq approach to link neighboring cell states/phenotypes and use it to propose regulators of intracellular circuits, which can be validated and screened for clinical relevance.

Applicants developed a platform for optically tagging cells within microscopic sub-regions of a cellular ensemble (e.g., a tissue slide from a tumor) prior to population-level RNA-Seq (or genomic, proteomic, lipidomic or other analyte profiling) or scRNA-Seq (or single-cell 'omic profiling), overcoming current limitations on spatial resolution and/or throughput. The invention comprehends direct, real-time imaging and labeling of a live biopsy, whole tissue resections, or any imageable cellular surface or biological surface or organic surface. Applicants synthesized a bioorthogonally reactive molecule as illustrated in FIG. 6 (e.g., cell functionalizing probe “CFP”) which, in the present example, is photoactivated to generate a reactive intermediate. Upon optical activation, the cell functionalizing probe tags a region of interest (“ROI”) in a live tissue section via, for example, rapid C—H bond insertion. After washing, a fluorescent oligonucleotide (e.g., spatial barcode oligonucleotide “SBO”) or other spatial tag is covalently linked via a bioorthogonal reaction. The Smart-Seq2-compatible oligonucleotide is retained through dissociation and FACS enrichment, thus enabling the assignment of cells to their imaged ROI.
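By way of non-limiting illustration only, the assignment of sequenced cells to their imaged ROI from recovered spatial barcodes can be sketched as follows. This is a minimal sketch under stated assumptions, not the Applicants' implementation: the function names, the example barcode sequences, and the `min_reads` and `min_fraction` thresholds are all hypothetical, and barcode reads are assumed to have already been extracted per cell.

```python
from collections import Counter, defaultdict


def assign_cells_to_rois(barcode_reads, barcode_to_roi, min_reads=2, min_fraction=0.6):
    """Assign each cell to the ROI whose spatial barcode dominates its reads.

    barcode_reads: iterable of (cell_id, spatial_barcode) pairs, one per read.
    barcode_to_roi: dict mapping a spatial barcode sequence to an ROI label.
    Cells with too few matching reads, or no clearly dominant ROI, map to None.
    """
    counts = defaultdict(Counter)
    for cell_id, barcode in barcode_reads:
        cell_counts = counts[cell_id]  # registers the cell even if no read matches
        roi = barcode_to_roi.get(barcode)
        if roi is not None:  # reads carrying unknown barcodes are ignored
            cell_counts[roi] += 1

    assignments = {}
    for cell_id, cell_counts in counts.items():
        total = sum(cell_counts.values())
        if total < max(min_reads, 1):
            assignments[cell_id] = None
            continue
        roi, top = cell_counts.most_common(1)[0]
        assignments[cell_id] = roi if top / total >= min_fraction else None
    return assignments


def group_neighbors(assignments):
    """Group assigned cells by ROI so that co-localized cells can be compared."""
    neighbors = defaultdict(list)
    for cell_id, roi in assignments.items():
        if roi is not None:
            neighbors[roi].append(cell_id)
    return dict(neighbors)


# Hypothetical toy input: barcode sequences and cell names are illustrative only.
reads = [
    ("cell1", "ACGT"), ("cell1", "ACGT"), ("cell1", "TTAG"),
    ("cell2", "TTAG"), ("cell2", "TTAG"),
    ("cell3", "ACGT"),
    ("cell4", "GGGG"),
]
roi_map = {"ACGT": "ROI_1", "TTAG": "ROI_2"}
assignments = assign_cells_to_rois(reads, roi_map)
neighbors = group_neighbors(assignments)
print(assignments)
print(neighbors)
```

In this sketch a cell is left unassigned rather than forced into an ROI when its barcode evidence is sparse or mixed; other thresholds or a probabilistic assignment could equally be used.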

In an aspect of the invention, the platform contains an automated reagent delivery system and DLP-enabled optical control for tagging many ROIs. The platform, in some aspects, can be coupled to a microscope to allow for real-time cell visualization, and thus quantification, of additional optical variables of interest including, but not limited to, morphology, fluorescent protein or antibody levels, etc.

The invention comprehends a method applicable to any tissue or cell ensemble structure enabling one of skill in the art to answer novel biological inquiries in any system without the requirement of transfection or genetic engineering. For example, the platform profiles, in a spatially-resolved fashion, mouse Braf/Pten melanoma and MC38 colon carcinoma tumor sections. Applicants are able to identify heterogeneity in both immune cell state/behavior and number with each geographical region, and then correlate those parameters with the gene expression programs observed in tumor cells found in the same region. Further, the invention provides a method to study the functional properties of a defined (by photoactivation, during cell sorting or prior to patterning) set of cells selected from a heterogeneous population. Functional properties of a tissue or cell ensemble arise through interactions of a variety of cells where structure in its environment (e.g., extracellular matrix) provides organization for the exchange of chemical, electrical, and mechanical information between neighboring and distant cells (Todhunter, et al. Nature Methods 2015, vol. 12(10), 975). Thus, in an aspect of the invention, Applicants reconstruct information provided by the structure of tissue by determining the geographical region of the cells within a tissue and linking those regions to different phenotypes (e.g., immune cell behaviors). Applicants use principal component analysis, computational methods, or any other linear or nonlinear dimensionality reduction technique and are able to identify or predict subpopulations of cells with a phenotype. The method may also “learn” or generate predictive cell-cell relationships or interactions. By measuring complex gene expression profiles at the single cell level, Applicants identify subsets of cells with expression patterns of interest that could not be detected when analyzing the “average” gene expression of the heterogeneous population. More generally, the method envisions the analysis of phenotypes of one or more cells informing the analysis of other co-localized cells, either within a specific spatial location or between spatial locations.
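By way of non-limiting illustration only, the following sketch shows how one linear dimensionality reduction, here the first principal component computed by power iteration, can separate two subpopulations that are indistinguishable in the averaged expression profile. The expression matrix is hypothetical toy data, the function name is an assumption, and a dependency-free implementation is used purely for self-containment; any established PCA routine could be substituted.

```python
import math


def first_principal_component(matrix, iterations=200):
    """Return (component, scores) for the first principal component
    of a cells x genes expression matrix, using power iteration."""
    n_cells, n_genes = len(matrix), len(matrix[0])
    means = [sum(row[j] for row in matrix) / n_cells for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in matrix]
    # Sample covariance matrix between genes (genes x genes).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n_cells - 1) for j in range(n_genes)]
        for i in range(n_genes)
    ]
    # Non-uniform start vector, so that it is unlikely to be orthogonal
    # to the leading eigenvector (a uniform start would fail on this toy data).
    vec = [float(j + 1) for j in range(n_genes)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n_genes)) for i in range(n_genes)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            break
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(n_genes)) for c in centered]
    return vec, scores


# Hypothetical toy data: 6 cells x 3 genes. The population averages of gene 1
# and gene 2 are identical, which hides the two underlying subpopulations.
expression = [
    [10, 1, 5], [9, 2, 5], [11, 1, 5],   # cells high in gene 1
    [1, 10, 5], [2, 9, 5], [1, 11, 5],   # cells high in gene 2
]
component, scores = first_principal_component(expression)
subpopulation = [score > 0 for score in scores]  # split cells by sign of PC1 score
print(subpopulation)
```

The sign of a principal component is arbitrary, so only the grouping of cells, not which group is labeled True, is meaningful in this sketch.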

It is to be understood that this invention is not limited to particular methods, components, products or combinations described, as such methods, components, products and combinations may, of course, vary. It is also to be understood that the terminology used herein is not intended to be limiting, since the scope of the present invention will be limited only by the appended claims. Preferred features and embodiments of this invention are set forth herein, including by way of numbered statements. Each feature and embodiment of the invention so discussed herein may be combined with any other feature and/or embodiments unless clearly indicated to the contrary. In particular, any feature indicated as being preferred or advantageous may be combined with any other feature or features indicated as being preferred or advantageous. Aspects of the present invention include any one or any combination of one or more of the aspects, features or embodiments discussed herein, including as enumerated herein 1 to 74, with any other statement and/or embodiments. The present invention is discussed with respect to particular embodiments but the invention is not limited thereto but only by the claims. Any reference signs in the claims shall not be construed as limiting the scope. As also discussed herein, the term “comprising” does not exclude other elements or steps. The terms “comprising”, “comprises” and “comprised of” as used herein are synonymous with “including”, “includes” or “containing”, “contains”, and are inclusive or open-ended and do not exclude additional, non-recited members, elements or method steps. It will be appreciated that the terms “comprising”, “comprises” and “comprised of” as used herein comprise the terms “consisting of”, “consists” and “consists of”, as well as the terms “consisting essentially of”, “consists essentially” and “consists essentially of”. “Consisting essentially of” permits inclusion of additional components not listed, provided that they do not materially affect the basic and novel properties of the invention. Singular terms, e.g., “a”, “an”, and “the” include both singular and plural referents unless the context clearly dictates otherwise. The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints. The term “about” or “approximately” as used herein when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, is meant to encompass variations of +/−20% or less, preferably +/−10% or less, more preferably +/−5% or less, and still more preferably +/−1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosed invention. It is to be understood that the value to which the modifier “about” or “approximately” refers is itself also specifically, and preferably, disclosed. Whereas the terms “one or more” or “at least one”, such as one or more or at least one member(s) of a group of members, is clear per se, by means of further exemplification, the term encompasses inter alia a reference to any one of said members, or to any two or more of said members, such as, e.g., any >3, >4, >5, >6 or >7 etc. of said members, and up to all said members. All references cited in the present specification are hereby incorporated by reference in their entirety. In particular, the teachings of all references herein specifically referred to are incorporated by reference. Unless otherwise defined, all terms used in disclosing the invention, including technical and scientific terms, have the meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. By means of further guidance, term definitions are included to better appreciate the teaching of the present invention.
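By way of non-limiting illustration only, the tolerance bands recited above for the term “about” (+/−20%, 10%, 5%, or 1% of the specified value) can be expressed as a simple check. The function name `within_about` and the default tolerance are assumptions made for this sketch.

```python
def within_about(value, specified, tolerance=0.20):
    """Return True if value lies within +/- tolerance (a fraction,
    e.g. 0.20 for 20%) of the specified value."""
    return abs(value - specified) <= tolerance * abs(specified)


print(within_about(115, 100))                    # 15% deviation, 20% band
print(within_about(115, 100, tolerance=0.10))    # 15% deviation, 10% band
print(within_about(99.5, 100, tolerance=0.01))   # 0.5% deviation, 1% band
```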

Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present invention. Thus, appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments. Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention, and form different embodiments, as would be understood by those in the art. For example, in the appended claims, any of the claimed embodiments can be used in any combination. Any drawings herewith form a part of this specification, and are provided as a way of illustration only of specific embodiments in which the invention may be practiced; but, it is to be understood that other embodiments may be utilized and structural or logical changes may be made without departing from the scope of the present invention. Accordingly, the herein detailed description is not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.

The following terms or definitions are provided solely to aid in the understanding of the invention. Unless specifically defined herein, all terms used herein have the same meaning as they would to one skilled in the art of the present invention. Practitioners are particularly directed to Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Press, Plainview, New York (1989); Ausubel et al., Current Protocols in Molecular Biology (Supplement 47), John Wiley & Sons, New York (1999); and Innis et al., PCR Protocols: A Guide to Methods and Applications, Academic Press: San Diego, 1990. General principles of microbiology are set forth, for example, in Davis, B. D. et al., Microbiology, 3rd edition, Harper & Row, publishers, Philadelphia, Pa. (1980), for definitions and terms of the art. The definitions provided herein should not be construed to have a scope less than understood by a person of ordinary skill in the art.

Unless indicated otherwise, all methods, steps, techniques and manipulations that are not specifically described in detail can be performed and have been performed in a manner known per se, as will be clear to the skilled person. Reference is for example again made to the standard handbooks, to the general background art referred to above and to the further references cited therein. Accordingly, the invention can be practiced without undue experimentation by way of the herein disclosure taken in conjunction with knowledge in the art.

The present invention provides tools and methods for the systematicanalysis of genomic interactions between cells, in particular immunecell subpopulations, including higher order interactions.

The present invention provides tools and methods for combinatorialprobing of cellular circuits, for dissecting cellular circuitry, fordelineating molecular pathways, and/or for identifying relevant targetsfor therapeutics development.

The present invention in certain embodiments relates to analyzinggenetic signatures of immune cells, such as molecular profiling at thesingle cell or cell (sub)population level, which immune cells arecharacterized by or characteristic of a particular immune responderphenotype.

In an aspect, the invention relates to a method of identifying an immunecell gene signature, protein signature, and/or other genetic orepigenetic signature associated with a specific immune responderphenotype or an immune cell subpopulation associated with a specificimmune responder phenotype, comprising:

-   comparing single cell or cell population RNA and/or protein expression profiles and/or other genetic or epigenetic profiles of a biological sample of said specific immune responder phenotype with single cell or cell population RNA and/or protein expression profiles and/or other genetic or epigenetic profiles of a biological sample of a different immune responder phenotype or a different immune cell subpopulation associated with said specific immune responder phenotype;
-   determining differentially expressed RNAs and/or proteins and/or other genetic or epigenetic elements;
-   determining an immune cell gene signature, protein signature, and/or other genetic or epigenetic signature associated with a specific immune responder phenotype or an immune cell subpopulation associated with a specific immune responder phenotype as one or more of said differentially expressed RNAs and/or proteins and/or other genetic or epigenetic elements.

Such a method in particular also allows identification of particular immune cell subpopulations which are specifically associated with a particular immune responder phenotype, as well as identification of a particular immune responder phenotype, based on detection of such gene signatures, protein signatures, and/or other genetic or epigenetic signatures.

All methods according to various aspects and embodiments of the invention may involve analyzing gene signatures, protein signatures, and/or other genetic or epigenetic signatures or (immune cell) phenotypes based on single cell analyses or alternatively based on cell population analyses.

In related aspects, the invention relates to gene signatures, protein signatures, and/or other genetic or epigenetic signatures of immune cells associated with particular immune responder phenotypes, such as for instance particular immune cell subpopulations. The invention further relates to particular immune cell subpopulations, which may be identified based on the methods according to the invention as discussed herein; as well as methods to obtain such cell (sub)populations and screening methods to identify immunomodulators capable of inducing or suppressing particular immune cell (sub)populations, such as for instance to alter immune cell population composition. Methods as described herein allow for instance in certain aspects the specific (partial) induction or (partial) depletion of particular immune cell subpopulations, such as to alter for instance an immune responder phenotype, which may in certain embodiments be defined by particular immune cell (sub)population compositions (e.g. different immune cell subpopulations characterized by specific immune cell states).

The invention further relates to various uses of the gene signatures, protein signatures, and/or other genetic or epigenetic signatures as defined herein, as well as various uses of the cells or cell (sub)populations as defined herein. Particularly advantageous uses include methods for identifying modulators based on the gene signatures, protein signatures, and/or other genetic or epigenetic signatures as defined herein. In an aspect, the invention to this end provides for a method of identifying a modulant capable of modulating, such as inducing or alternatively suppressing, a specific responder phenotype having a specific gene signature, protein signature, and/or other genetic or epigenetic signature, comprising: applying a candidate modulant to a cell or a population of cells and identifying a modulant capable of inducing or alternatively suppressing a specific responder phenotype if said specific gene signature, protein signature, and/or other genetic or epigenetic signature is induced or alternatively repressed in one or more of said cells. The invention further relates to modulators capable of modulating, such as inducing or repressing, a particular responder phenotype or a specific gene signature, protein signature, and/or other genetic or epigenetic signature, as well as their use for modulating, such as inducing or repressing, a particular responder phenotype, or a particular gene signature, protein signature, and/or other genetic or epigenetic signature. Such modulation may include for instance specific induction or alternatively specific reduction of particular cells, or cell (sub)populations.

In further related aspects, the invention relates to diagnostic (including monitoring the status of a subject), prognostic (including monitoring treatment efficacy), prophylactic, or therapeutic methods. Diagnostic or prognostic methods according to the invention in particular may comprise detecting the gene signatures, protein signatures, and/or other genetic or epigenetic signatures as discussed herein. Therapeutic or prophylactic methods according to the invention in particular may comprise modulating the responder phenotype, and may include modulating the gene signature, protein signature, and/or other genetic or epigenetic signature of cells or cell (sub)populations. Such methods include both in vitro as well as in vivo modulation.

As used herein, the term “gene signature” may be used interchangeably with the term “signature gene”. These terms relate to one or more genes (or one or more particular splice variants thereof), the (increased) expression or activity of which or alternatively the decreased or absent expression or activity of which is characteristic for a particular (multi)cellular phenotype, i.e. the occurrence of such particular (multi)cellular phenotype may be identified based on the presence or absence of such gene signature. The signature may thus be characteristic of a particular phenotype, but may also be characteristic of a particular immune cell subpopulation within a particular phenotype. Similarly, an “epigenetic signature” relates to one or more epigenetic elements (or modifications), the (increased) occurrence of which or alternatively the absence of which is characteristic for a particular (multi)cellular phenotype, i.e. the occurrence of such particular (multi)cellular phenotype may be identified based on the presence or absence of such epigenetic signature. As used herein a signature encompasses any gene or genes or epigenetic element(s) whose expression profile or whose occurrence is associated with a specific cell type, subtype, or cell state of a specific cell type or subtype within a population of cells. Increased or decreased expression or activity or prevalence may be compared between different phenotypes in order to characterize or identify specific phenotypes. A gene signature as used herein may thus refer to any set of up- and down-regulated genes between two (multi)cellular states or phenotypes derived from a gene-expression profile. For example, a gene signature may comprise a list of genes differentially expressed in a distinction of interest (e.g., high responders versus low responders; diseased state versus normal state; etc.). Similarly, an epigenetic signature as used herein may thus refer to any set of induced or repressed epigenetic elements between two (multi)cellular states or phenotypes derived from an epigenetic profile. For example, an epigenetic signature may comprise a list of epigenetic elements differentially present in a distinction of interest (e.g., high responders versus low responders; diseased state versus normal state; etc.). It is to be understood that also when referring to proteins (e.g. differentially expressed proteins), such may fall within the definition of “gene” signature, and may on certain occasions be referred to as “protein signature”.

The signature as defined herein (be it a gene signature, protein signature or other genetic or epigenetic signature) can be used to indicate the presence of a cell type, a subtype of the cell type, the state of the microenvironment of a population of cells, a particular cell type population or subpopulation, and/or the overall status of the entire cell (sub)population. Furthermore, the signature may be indicative of cells within a population of cells in vivo. The signature may also be used to suggest for instance particular therapies, or to follow up treatment, or to suggest ways to modulate cell systems. The signatures of the present invention may be discovered by analysis of expression profiles of single cells within a population of cells from isolated samples (e.g. blood samples), thus allowing the discovery of novel cell subtypes or cell states that were previously invisible or unrecognized. The presence of subtypes or cell states may be determined by subtype specific or cell state specific signatures. The presence of these specific cell (sub)types or cell states may be determined by applying the signature genes to bulk sequencing data in a sample. Not being bound by a theory, the signatures of the present invention may be microenvironment specific, such as their expression in a particular spatio-temporal context. Not being bound by a theory, signatures as discussed herein are specific to a particular pathological context (e.g. infection, cancer, autoimmune disease, allergy). Not being bound by a theory, a combination of cells having a particular signature may indicate an outcome. Not being bound by a theory, a spatial pattern can be used to deconvolute the network of cells present in a particular condition (healthy or pathological). Not being bound by a theory, the signatures can be used to deconvolute the network of cells present in a particular pathological condition. Not being bound by a theory, the presence of specific cells and cell subtypes is indicative of a particular response to treatment, such as increased or decreased susceptibility to treatment. The signature may indicate the presence of one particular cell type. In one embodiment, the novel signatures are used to detect multiple cell states that occur in a subpopulation of cells that are linked to a particular pathological condition, or linked to a particular outcome or progression of a pathological condition, or linked to a particular response to treatment of a pathological condition.

The signature according to certain embodiments of the present invention may comprise or consist of one or more genes, proteins and/or epigenetic elements, such as for instance 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or more. It is to be understood that a signature according to the invention may for instance also include genes or proteins as well as epigenetic elements combined.

In certain embodiments, a signature is characterized as being specific for a particular responder phenotype or specific for a particular cell or cell (sub)population if it is only present, detected or detectable in that particular responder phenotype or in that particular cell or cell (sub)population. In this context, a signature consists of one or more differentially expressed genes/proteins or differential epigenetic elements when comparing different immune responder phenotypes or different immune cells or immune cell (sub)populations. It is to be understood that “differentially expressed” genes/proteins include genes/proteins which are up- or down-regulated as well as genes/proteins which are turned on or off. When referring to up- or down-regulation, in certain embodiments, such up- or down-regulation is preferably at least two-fold, such as two-fold, three-fold, four-fold, five-fold, or more, such as for instance at least ten-fold, at least 20-fold, at least 30-fold, at least 40-fold, at least 50-fold, or more. Alternatively, or in addition, differential expression may be determined based on common statistical tests, as is known in the art.
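The fold-change criterion described above can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the method of the invention: the function names, the use of mean expression values per gene, and the pseudocount of 1.0 (added so that genes that are "turned off" do not cause division by zero) are all hypothetical choices made for this example.

```python
def fold_change(mean_a, mean_b, pseudocount=1.0):
    """Ratio of mean expression in phenotype A over phenotype B.

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite when a gene is switched off in one phenotype.
    """
    return (mean_a + pseudocount) / (mean_b + pseudocount)


def classify_regulation(mean_a, mean_b, min_fold=2.0, pseudocount=1.0):
    """Label a gene 'up', 'down' or 'unchanged' in A relative to B."""
    fc = fold_change(mean_a, mean_b, pseudocount)
    if fc >= min_fold:
        return "up"
    if fc <= 1.0 / min_fold:
        return "down"
    return "unchanged"


def signature_from_profiles(profile_a, profile_b, min_fold=2.0):
    """Collect genes whose regulation meets the fold threshold.

    profile_a and profile_b map gene name -> mean expression
    (hypothetical input format).
    """
    signature = {}
    for gene in sorted(set(profile_a) & set(profile_b)):
        call = classify_regulation(profile_a[gene], profile_b[gene], min_fold)
        if call != "unchanged":
            signature[gene] = call
    return signature


# Hypothetical example profiles (placeholder gene names and values).
high_responder = {"GENE_A": 99.0, "GENE_B": 9.0, "GENE_C": 0.0}
low_responder = {"GENE_A": 9.0, "GENE_B": 9.0, "GENE_C": 19.0}
example_signature = signature_from_profiles(high_responder, low_responder)
```

Raising `min_fold` (for example to 10.0 or 50.0) narrows the signature to the more strongly regulated genes. A statistical test, which the text mentions as an alternative or addition, is not shown here.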

As discussed herein, differentially expressed genes/proteins, or differential epigenetic elements may be differentially expressed on a single cell level, or may be differentially expressed on a cell population level. Preferably, the differentially expressed genes/proteins or epigenetic elements as discussed herein, such as those constituting the gene signatures as discussed herein, when referring to the cell population level, are genes that are differentially expressed in all or substantially all cells of the population (such as at least 80%, preferably at least 90%, such as at least 95% of the individual cells). This allows one to define a particular subpopulation of cells. As referred to herein, a “subpopulation” of cells preferably refers to a particular subset of cells of a particular cell type which can be distinguished or are uniquely identifiable and set apart from other cells of this cell type. The cell subpopulation may be phenotypically characterized, and is preferably characterized by the signature as discussed herein. A cell (sub)population as referred to herein may consist of a (sub)population of cells of a particular cell type characterized by a specific cell state.
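The "all or substantially all cells" criterion can be made concrete with a short Python sketch. The per-cell expression threshold used to decide whether a cell expresses a gene, and the function names, are assumptions of this example; only the 80%, 90% and 95% fractions come from the text above.

```python
def fraction_of_cells_above(values, threshold):
    """Fraction of cells whose expression value exceeds the threshold."""
    if not values:
        raise ValueError("at least one cell is required")
    return sum(1 for v in values if v > threshold) / len(values)


def is_population_level(values, threshold, min_fraction=0.8):
    """True if the gene is expressed in at least min_fraction of cells.

    min_fraction defaults to 0.8 (at least 80% of the individual cells);
    0.9 or 0.95 give the stricter variants mentioned in the text.
    """
    return fraction_of_cells_above(values, threshold) >= min_fraction


# Hypothetical per-cell expression values for one gene in ten cells.
cells = [5.0, 6.0, 7.0, 8.0, 0.0, 9.0, 10.0, 11.0, 12.0, 13.0]
expressed_fraction = fraction_of_cells_above(cells, threshold=1.0)
```

In this example nine of ten cells exceed the threshold, so the gene meets the 80% and 90% criteria but not the 95% criterion.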

In one embodiment, the cells are detected by immunofluorescence, by mass cytometry (CyTOF), FACS, ATAC-seq, in situ hybridization, etc. Other methods including absorbance assays and colorimetric assays are known in the art and may be used herein.

When referring to induction, or alternatively suppression, of a particular signature, what is preferably meant is induction or alternatively suppression (or upregulation or downregulation) of at least one gene/protein and/or epigenetic element of the signature, such as for instance at least two, at least three, at least four, at least five, at least six, or all genes/proteins and/or epigenetic elements of the signature.
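As an illustration of counting how many members of a signature are induced, the following Python sketch compares expression before and after a perturbation. The two-fold induction cutoff, the pseudocount and all names are assumptions made for this example; the text itself only specifies the minimum number of signature members (at least one, two, three, and so on, or all).

```python
def induced_members(signature_genes, before, after, min_fold=2.0, pseudocount=1.0):
    """Return signature genes whose expression rose by at least min_fold.

    before and after map gene name -> expression value (hypothetical
    input format); missing genes are treated as not expressed.
    """
    hits = []
    for gene in signature_genes:
        ratio = (after.get(gene, 0.0) + pseudocount) / (before.get(gene, 0.0) + pseudocount)
        if ratio >= min_fold:
            hits.append(gene)
    return hits


def signature_induced(signature_genes, before, after, min_members=1, min_fold=2.0):
    """True if at least min_members signature genes are induced."""
    hits = induced_members(signature_genes, before, after, min_fold)
    return len(hits) >= min_members


# Hypothetical three-gene signature measured before and after a candidate modulant.
signature = ["G1", "G2", "G3"]
before = {"G1": 1.0, "G2": 4.0, "G3": 9.0}
after = {"G1": 7.0, "G2": 5.0, "G3": 19.0}
hits = induced_members(signature, before, after)
```

Setting `min_members=len(signature)` corresponds to requiring induction of all members. Suppression could be checked the same way by swapping `before` and `after`.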

Signatures may be functionally validated as being uniquely associated with a particular phenotype, such as one giving a preferred response (“responder phenotype”). Induction or suppression of a particular signature may consequently be associated with or causally drive a particular responder phenotype.

As used herein, the term “responder phenotype” may be used interchangeably with “response phenotype”. These terms refer to an individual characterized by a specific immune response towards a pathological insult. By extension, these terms also refer to organs, tissues, cells, or cell (sub)populations of such individuals, including immune cells. By means of example, a specific response may constitute an improved or vigorous immune response or alternatively a poor immune response; a fast immune response or alternatively a slow immune response; an immune response characterized by for instance a specific cytokine profile or a specific sequence of succession of cytokine expression; etc., without limitation, for instance in the context of viral infection.

As used herein, the term “immune responder phenotype” may be used interchangeably with “immune response phenotype”. These terms refer to an individual characterized by a specific immune response towards a pathological insult, or by a particular immunological state. By extension, these terms also refer to organs, tissues, cells, or cell (sub)populations of such individuals, including immune cells. By means of example, a specific immune response may constitute an improved or vigorous immune response or alternatively a poor immune response; a fast immune response or alternatively a slow immune response; an immune response characterized by for instance a specific cytokine profile or a specific sequence of succession of cytokine expression; etc., without limitation. For instance, in the context of viral infection, such as HIV infection, the immune responder phenotype may be an elite controller. Elite controller is a term applied to the rare group of HIV-positive individuals who maintain substantially undetectable viral loads in the absence of any treatment. Although genetic variability in the HLA locus and enhanced CD8 T cell immunity have been proposed to be some of the causes of the spontaneous immunological control of HIV-1 in this cohort, the cellular and molecular mechanisms responsible for the elite controller phenotype are not fully understood. An elite controller may for instance be defined as having consecutive undetectable HIV-RNA measurements for more than six months or otherwise with at least 90% of measurements having less than 400 copies/ml over at least 10 years. Other immune responder phenotypes for instance include long term non progressors (LTNP), slow progressors, HIV controllers (HICs), viremic controllers, noncontrollers, and rapid progressors. A progressive controller has roughly 100 copies of HIV while a viremic controller has full blown AIDS and declining health. Without wishing to be bound by any one particular theory, it is believed that someone who is a viremic controller was formerly an elite or progressive controller. The present invention thus relates to examining signatures for an elite controller, a progressive controller and a viremic individual. For a viremic controller and potentially a progressive controller, in certain embodiments it is desirable to modify the signatures of that individual by modulating, such as perturbing, the system such that the signature resembles that of an elite controller. Other immune responder phenotypes for instance include phenotypes based on neutralizing antibody breadth, such as a phenotype characterized by broadly neutralizing antibodies. Neutralizing antibodies (NAbs) are antibodies which defend a cell from an antigen or infectious body by inhibiting or neutralizing any effect it has biologically. Broadly neutralizing antibodies are neutralizing antibodies which for instance are capable of neutralizing multiple disease strains, such as multiple HIV strains.
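The example definition of an elite controller given above (consecutive undetectable HIV-RNA measurements for more than six months, or otherwise at least 90% of measurements below 400 copies/ml over at least 10 years) can be expressed as a simple rule. The Python sketch below is illustrative only. The assay detection limit of 50 copies/ml, the day counts used for "six months" (183) and "10 years" (3650), and the input format are assumptions of this example and are not specified in the text.

```python
def longest_undetectable_run_days(measurements, detection_limit):
    """Longest span, in days, of consecutive undetectable measurements.

    measurements is a list of (day, copies_per_ml) tuples sorted by day.
    """
    best = 0
    run_start = None
    for day, copies in measurements:
        if copies < detection_limit:
            if run_start is None:
                run_start = day
            best = max(best, day - run_start)
        else:
            run_start = None
    return best


def is_elite_controller(measurements, detection_limit=50.0, min_run_days=183,
                        low_threshold=400.0, min_low_fraction=0.9,
                        min_span_days=3650):
    """Apply the two alternative criteria from the example definition."""
    if not measurements:
        return False
    ordered = sorted(measurements)
    # Criterion 1: consecutive undetectable measurements for more than six months.
    if longest_undetectable_run_days(ordered, detection_limit) > min_run_days:
        return True
    # Criterion 2: at least 90% of measurements below 400 copies/ml over at least 10 years.
    span = ordered[-1][0] - ordered[0][0]
    low = sum(1 for _, copies in ordered if copies < low_threshold)
    return span >= min_span_days and low / len(ordered) >= min_low_fraction


# Hypothetical measurement series as (day, copies/ml).
short_undetectable = [(0, 20.0), (100, 30.0), (200, 10.0)]
interrupted = [(0, 20.0), (100, 500.0), (200, 10.0)]
ten_years = [(365 * i, 100.0) for i in range(11)]
ten_years[5] = (365 * 5, 1000.0)
```

Here `short_undetectable` satisfies the first criterion, `interrupted` satisfies neither, and `ten_years` satisfies the second criterion because 10 of 11 measurements are below 400 copies/ml across 3650 days.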

As used herein, “co-occurant” may be used to indicate the temporal property of two things happening at the same time; events belonging to the same time, or being observed at the same time.

In certain embodiments, the immune responder phenotype is characteristic of, associated with, or correlated with a particular (immune) response to a pathological condition.

A “pathological condition” as referred to herein includes any physiologically abnormal condition of an organism, which results in damage of or harm to the organism. A pathological condition as referred to herein is in particular associated with or causally related to an immunological response of the host organism, such as for instance an enhanced or improved immunological response or alternatively a decreased or reduced immunological response. The type of immunological response of an individual to a pathological condition may characterize the immune responder phenotype. The immunological response may be compared between individuals each of whom is afflicted with the pathological condition, thereby allowing differentiation between or identification of different immune responder phenotypes, or alternatively may be compared between individuals afflicted with the pathological condition and individuals not afflicted with the pathological condition.

In certain embodiments, the pathological condition as referred to herein is an infection, autoimmune disease, allergy, or cancer. It is to be understood that in aspects and embodiments wherein reference is made to prophylaxis, such means that the pathological condition referred to is to be prevented, such as for instance the prevention of infection, autoimmune disease, allergy, or cancer. More generally, the pathological condition as referred to herein may include any pathological condition in which the immune system is involved and/or the immune system reacts abnormally or inappropriately, and may for instance also include graft versus host disease.

In certain embodiments, infection is due to bacteria, virus, protozoa, parasite, or fungus.

In certain embodiments, infection is a bacterial infection. In certain embodiments, the bacterial infection is infection due to Bacillus sp. (e.g. Bacillus anthracis, Bacillus cereus), Bartonella sp. (e.g. Bartonella henselae, Bartonella quintana), Bordetella sp. (e.g. Bordetella pertussis), Borrelia sp. (e.g. Borrelia burgdorferi, Borrelia garinii, Borrelia afzelii, Borrelia recurrentis), Brucella sp. (e.g. Brucella abortus, Brucella canis, Brucella melitensis, Brucella suis), Campylobacter sp. (e.g. Campylobacter jejuni), Chlamydia sp. (e.g. Chlamydia pneumoniae, Chlamydia trachomatis), Chlamydophila sp. (e.g. Chlamydophila psittaci), Clostridium sp. (e.g. Clostridium botulinum, Clostridium difficile, Clostridium perfringens, Clostridium tetani), Corynebacterium sp. (e.g. Corynebacterium diphtheriae), Enterococcus sp. (e.g. Enterococcus faecalis, Enterococcus faecium), Escherichia sp. (e.g. Escherichia coli), Francisella sp. (e.g. Francisella tularensis), Haemophilus sp. (e.g. Haemophilus influenzae), Helicobacter sp. (e.g. Helicobacter pylori), Legionella sp. (e.g. Legionella pneumophila), Leptospira sp. (e.g. Leptospira interrogans, Leptospira santarosai, Leptospira weilii, Leptospira noguchii), Listeria sp. (e.g. Listeria monocytogenes), Mycobacterium sp. (e.g. Mycobacterium leprae, Mycobacterium tuberculosis, Mycobacterium ulcerans), Mycoplasma sp. (e.g. Mycoplasma pneumoniae), Neisseria sp. (e.g. Neisseria gonorrhoeae, Neisseria meningitidis), Pseudomonas sp. (e.g. Pseudomonas aeruginosa), Rickettsia sp. (e.g. Rickettsia rickettsii), Salmonella sp. (e.g. Salmonella typhi, Salmonella typhimurium), Shigella sp. (e.g. Shigella sonnei), Staphylococcus sp. (e.g. Staphylococcus aureus, Staphylococcus epidermidis, Staphylococcus saprophyticus), Streptococcus sp. (e.g. Streptococcus agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes), Treponema sp. (e.g. Treponema pallidum), Ureaplasma sp. (e.g. Ureaplasma urealyticum), Vibrio sp. (e.g. Vibrio cholerae), or Yersinia sp. (e.g. Yersinia pestis, Yersinia enterocolitica, Yersinia pseudotuberculosis).

In certain embodiments, infection is a viral infection. In certain embodiments, the viral infection is infection due to Adenoviridae (e.g. Adenovirus), Herpesviridae (e.g. Herpes simplex, type 1, Herpes simplex, type 2, Varicella-zoster virus, Epstein-Barr virus, Human cytomegalovirus, Human herpesvirus, type 8), Papillomaviridae (e.g. Human papillomavirus), Polyomaviridae (e.g. BK virus, JC virus), Poxviridae (e.g. Smallpox), Hepadnaviridae (e.g. Hepatitis B virus), Parvoviridae (e.g. Parvovirus B19), Astroviridae (e.g. Human astrovirus), Caliciviridae (e.g. Norwalk virus), Picornaviridae (e.g. coxsackievirus, hepatitis A virus, poliovirus, rhinovirus), Coronaviridae (e.g. Severe acute respiratory syndrome virus), Flaviviridae (e.g. Hepatitis C virus, yellow fever virus, dengue virus, West Nile virus, TBE virus), Togaviridae (e.g. Rubella virus), Hepeviridae (e.g. Hepatitis E virus), Retroviridae (e.g. Human immunodeficiency virus (HIV)), Orthomyxoviridae (e.g. Influenza virus), Arenaviridae (e.g. Lassa virus), Bunyaviridae (e.g. Crimean-Congo hemorrhagic fever virus, Hantaan virus), Filoviridae (e.g. Ebola virus, Marburg virus), Paramyxoviridae (e.g. Measles virus, Mumps virus, Parainfluenza virus, Respiratory syncytial virus), Rhabdoviridae (e.g. Rabies virus), Hepatitis D, or Reoviridae (e.g. Rotavirus, Orbivirus, Coltivirus, Banna virus).

In certain embodiments, infection is a protozoal or parasitic infection. In certain embodiments, the protozoan or parasitic infection is infection due to Euglenozoa (e.g. Trypanosoma cruzi, Trypanosoma brucei, Leishmania spp.), Heterolobosea (e.g. Naegleria fowleri), Diplomonadida (e.g. Giardia intestinalis), Amoebozoa (e.g. Acanthamoeba castellanii, Balamuthia mandrillaris, Entamoeba histolytica), Blastocystis (e.g. Blastocystis hominis), Apicomplexa (e.g. Babesia microti, Cryptosporidium parvum, Cyclospora cayetanensis, Plasmodium spp., Toxoplasma gondii), Roundworm infection (nematodiasis) (e.g. Filariasis (Wuchereria bancrofti, Brugia malayi infection), Onchocerciasis (Onchocerca volvulus infection), Soil-transmitted helminthiasis including ascariasis (Ascaris lumbricoides infection), trichuriasis (Trichuris infection), and hookworm infection (includes Necatoriasis and Ancylostoma duodenale infection), Trichostrongyliasis (Trichostrongylus spp. infection), Dracunculiasis (guinea worm infection)); Tapeworm infection (cestodiasis) (e.g. Echinococcosis (Echinococcus infection), Hymenolepiasis (Hymenolepis infection), Taeniasis/cysticercosis (Taenia infection), Coenurosis (T. multiceps, T. serialis, T. glomerata and T. brauni infection)); Trematode infection (trematodiasis) (e.g. Amphistomiasis (amphistomes infection), Clonorchiasis (Clonorchis sinensis infection), Fascioliasis (Fasciola infection), Fasciolopsiasis (Fasciolopsis buski infection), Opisthorchiasis (Opisthorchis infection), Paragonimiasis (Paragonimus infection), Schistosomiasis/bilharziasis (Schistosoma infection)); and Acanthocephala infection (e.g. Moniliformis infection).

In certain embodiments, infection is a fungal infection. In certain embodiments, the fungal infection is infection due to Candida species, such as C. albicans; Cryptococcus species, such as C. neoformans, C. gattii; Aspergillus species, such as A. fumigatus and A. flavus; Pneumocystis species, such as P. carinii; Coccidioides species such as C. immitis; Trichophyton species such as T. verrucosum; Blastomyces species such as B. dermatitidis; Histoplasma species such as H. capsulatum; Paracoccidioides species such as P. brasiliensis; Mucoromycotina sp.; Sporothrix sp., such as S. schenckii; and Pythium species such as P. insidiosum.

In certain embodiments autoimmune diseases are selected from Myocarditis, Postmyocardial infarction syndrome, Postpericardiotomy syndrome, Subacute bacterial endocarditis, Anti-Glomerular Basement Membrane nephritis, Interstitial cystitis, Lupus nephritis, Autoimmune hepatitis, Primary biliary cirrhosis, Primary sclerosing cholangitis, Antisynthetase syndrome, Alopecia Areata, Autoimmune Angioedema, Autoimmune progesterone dermatitis, Autoimmune urticaria, Bullous pemphigoid, Cicatricial pemphigoid, Dermatitis herpetiformis, Discoid lupus erythematosus, Epidermolysis bullosa acquisita, Erythema nodosum, Gestational pemphigoid, Hidradenitis suppurativa, Lichen planus, Lichen sclerosus, Linear IgA disease, Morphea, Pemphigus vulgaris, Pityriasis lichenoides et varioliformis acuta, Mucha-Habermann disease, Psoriasis, Systemic scleroderma, Vitiligo, Addison's disease, Autoimmune polyendocrine syndrome, Autoimmune polyendocrine syndrome type 2, Autoimmune polyendocrine syndrome type 3, Autoimmune pancreatitis, Diabetes mellitus type 1, Autoimmune thyroiditis, Ord's thyroiditis, Graves' disease, Autoimmune Oophoritis, Endometriosis, Autoimmune orchitis, Sjogren's syndrome, Autoimmune enteropathy, Celiac disease, Crohn's disease, Microscopic colitis, Ulcerative colitis, Antiphospholipid syndrome, Aplastic anemia, Autoimmune hemolytic anemia, Autoimmune lymphoproliferative syndrome, Autoimmune neutropenia, Autoimmune thrombocytopenic purpura, Cold agglutinin disease, Essential mixed cryoglobulinemia, Evans syndrome, IgG4-related systemic disease, Paroxysmal nocturnal hemoglobinuria, Pernicious anemia, Pure red cell aplasia, Thrombocytopenia, Adiposis dolorosa, Adult-onset Still's disease, Ankylosing Spondylitis, CREST syndrome, Drug-induced lupus, Enthesitis-related arthritis, Eosinophilic fasciitis, Felty syndrome, Juvenile Arthritis, Lyme disease (Chronic), Mixed connective tissue disease, Palindromic rheumatism, Parry Romberg syndrome, Parsonage-Turner syndrome, Psoriatic arthritis, Reactive arthritis, Relapsing polychondritis, Retroperitoneal fibrosis, Rheumatic fever, Rheumatoid arthritis, Sarcoidosis, Schnitzler syndrome, Systemic Lupus Erythematosus, Undifferentiated connective tissue disease, Dermatomyositis, Fibromyalgia, Inclusion body myositis, Myositis, Myasthenia gravis, Neuromyotonia, Paraneoplastic cerebellar degeneration, Polymyositis, Acute disseminated encephalomyelitis, Acute motor axonal neuropathy, Anti-N-Methyl-D-Aspartate Receptor Encephalitis, Balo concentric sclerosis, Bickerstaff's encephalitis, Chronic inflammatory demyelinating polyneuropathy, Guillain-Barré syndrome, Hashimoto's encephalopathy, Idiopathic inflammatory demyelinating diseases, Lambert-Eaton myasthenic syndrome, Multiple sclerosis, Pediatric Autoimmune Neuropsychiatric Disorder Associated with Streptococcus, Progressive inflammatory neuropathy, Restless leg syndrome, Stiff person syndrome, Sydenham chorea, Transverse myelitis, Autoimmune retinopathy, Autoimmune uveitis, Cogan syndrome, Graves' ophthalmopathy, Intermediate uveitis, Ligneous conjunctivitis, Mooren's ulcer, Neuromyelitis optica, Opsoclonus myoclonus syndrome, Optic neuritis, Scleritis, Susac's syndrome, Sympathetic ophthalmia, Tolosa-Hunt syndrome, Autoimmune inner ear disease, Meniere's disease, Anti-neutrophil cytoplasmic antibody-associated vasculitis, Behcet's disease, Churg-Strauss syndrome, Giant cell arteritis, Henoch-Schonlein purpura, Kawasaki's disease, Leukocytoclastic vasculitis, Lupus vasculitis, Rheumatoid vasculitis, Microscopic polyangiitis, Polyarteritis nodosa, Polymyalgia rheumatica, Urticarial vasculitis, and Vasculitis.

In certain embodiments allergic diseases are selected from allergic rhinitis, drug allergy, latex allergy, insect sting/bite allergy, urticaria, contact dermatitis, allergic conjunctivitis, hay fever, food allergies, atopic dermatitis, allergic asthma, and anaphylaxis.

In certain embodiments cancer is selected from carcinoma, sarcoma, lymphoma, leukemia, germ cell tumors, and blastoma. In certain embodiments cancer is selected from Acute lymphoblastic leukemia (ALL); Acute myeloid leukemia; Adrenocortical carcinoma; AIDS-related cancers; AIDS-related lymphoma; Anal cancer; Appendix cancer; Astrocytoma, childhood cerebellar or cerebral; Basal-cell carcinoma; Bile duct cancer, extrahepatic (see cholangiocarcinoma); Bladder cancer; Bone tumor, osteosarcoma/malignant fibrous histiocytoma; Brainstem glioma; Brain cancer; Brain tumor, cerebellar astrocytoma; Brain tumor, cerebral astrocytoma/malignant glioma; Brain tumor, ependymoma; Brain tumor, medulloblastoma; Brain tumor, supratentorial primitive neuroectodermal tumors; Brain tumor, visual pathway and hypothalamic glioma; Breast cancer; Bronchial adenomas/carcinoids; Burkitt's lymphoma; Carcinoid tumor, childhood; Carcinoid tumor, gastrointestinal; Carcinoma of unknown primary; Central nervous system lymphoma, primary; Cerebellar astrocytoma, childhood; Cerebral astrocytoma/malignant glioma, childhood; Cervical cancer; Childhood cancers; Chondrosarcoma; Chronic lymphocytic leukemia; Chronic myelogenous leukemia; Chronic myeloproliferative disorders; Colon cancer; Cutaneous T-cell lymphoma; Desmoplastic small round cell tumor; Endometrial cancer; Ependymoma; Esophageal cancer; Ewing's sarcoma in the Ewing family of tumors; Extracranial germ cell tumor, childhood; Extragonadal germ cell tumor; Extrahepatic bile duct cancer; Eye cancer, intraocular melanoma; Eye cancer, retinoblastoma; Gallbladder cancer; Gastric (stomach) cancer; Gastrointestinal carcinoid tumor; Gastrointestinal stromal tumor (GIST); Germ cell tumor: extracranial, extragonadal, or ovarian; Gestational trophoblastic tumor; Glioma of the brain stem; Glioma, childhood cerebral astrocytoma; Glioma, childhood visual pathway and hypothalamic; Gastric carcinoid; Hairy cell leukemia; Head and neck cancer; Heart cancer; Hepatocellular (liver) cancer; Hodgkin lymphoma; Hypopharyngeal cancer; Hypothalamic and visual pathway glioma, childhood; Intraocular melanoma; Islet cell carcinoma (endocrine pancreas); Kaposi sarcoma; Kidney cancer (renal cell cancer); Laryngeal cancer; Leukemias; Leukemia, acute lymphoblastic (also called acute lymphocytic leukemia); Leukemia, acute myeloid (also called acute myelogenous leukemia); Leukemia, chronic lymphocytic (also called chronic lymphocytic leukemia); Leukemia, chronic myelogenous (also called chronic myeloid leukemia); Leukemia, hairy cell; Lip and oral cavity cancer; Liposarcoma; Liver cancer (primary); Lung cancer, non-small cell; Lung cancer, small cell; Lymphomas; Lymphoma, AIDS-related; Lymphoma, Burkitt; Lymphoma, cutaneous T-Cell; Lymphoma, Hodgkin; Lymphomas, Non-Hodgkin (an old classification of all lymphomas except Hodgkin's); Lymphoma, primary central nervous system; Macroglobulinemia, Waldenstrom; Male breast cancer; Malignant fibrous histiocytoma of bone/osteosarcoma; Medulloblastoma, childhood; Melanoma; Melanoma, intraocular (eye); Merkel cell cancer; Mesothelioma, adult malignant; Mesothelioma, childhood; Metastatic squamous neck cancer with occult primary; Mouth cancer; Multiple endocrine neoplasia syndrome, childhood; Multiple myeloma/plasma cell neoplasm; Mycosis fungoides; Myelodysplastic syndromes; Myelodysplastic/myeloproliferative diseases; Myelogenous leukemia, chronic; Myeloid leukemia, adult acute; Myeloid leukemia, childhood acute; Myeloma, multiple (cancer of the bone-marrow); Myeloproliferative disorders, chronic; Myxoma; Nasal cavity and paranasal sinus cancer; Nasopharyngeal carcinoma; Neuroblastoma; Non-Hodgkin lymphoma; Non-small cell lung cancer; Oligodendroglioma; Oral cancer; Oropharyngeal cancer; Osteosarcoma/malignant fibrous histiocytoma of bone; Ovarian cancer; Ovarian epithelial cancer (surface epithelial-stromal tumor); Ovarian germ cell tumor; Ovarian low malignant potential tumor; Pancreatic cancer; Pancreatic cancer, islet cell; Paranasal sinus and nasal cavity cancer; Parathyroid cancer; Penile cancer; Pharyngeal cancer; Pheochromocytoma; Pineal astrocytoma; Pineal germinoma; Pineoblastoma and supratentorial primitive neuroectodermal tumors, childhood; Pituitary adenoma; Plasma cell neoplasia/Multiple myeloma; Pleuropulmonary blastoma; Primary central nervous system lymphoma; Prostate cancer; Rectal cancer; Renal cell carcinoma (kidney cancer); Renal pelvis and ureter, transitional cell cancer; Retinoblastoma; Rhabdomyosarcoma, childhood; Salivary gland cancer; Sarcoma, Ewing family of tumors; Sarcoma, Kaposi; Sarcoma, soft tissue; Sarcoma, uterine; Sézary syndrome; Skin cancer (non-melanoma); Skin cancer (melanoma); Skin carcinoma, Merkel cell; Small cell lung cancer; Small intestine cancer; Soft tissue sarcoma; Squamous cell carcinoma (see skin cancer (non-melanoma)); Squamous neck cancer with occult primary, metastatic; Stomach cancer; Supratentorial primitive neuroectodermal tumor, childhood; T-Cell lymphoma, cutaneous (see Mycosis Fungoides and Sézary syndrome); Testicular cancer; Throat cancer; Thymoma, childhood; Thymoma and thymic carcinoma; Thyroid cancer; Thyroid cancer, childhood; Transitional cell cancer of the renal pelvis and ureter; Trophoblastic tumor, gestational; Unknown primary site, carcinoma of, adult; Unknown primary site, cancer of, childhood; Ureter and renal pelvis, transitional cell cancer; Urethral cancer; Uterine cancer, endometrial; Uterine sarcoma; Vaginal cancer; Visual pathway and hypothalamic glioma, childhood; Vulvar cancer; Waldenström macroglobulinemia; and Wilms tumor (kidney cancer), childhood.

In certain embodiments, the infectious disease is a prion disease. Prions are infectious pathogens that do not contain nucleic acids. These abnormally folded proteins are characteristically found in diseases such as scrapie, bovine spongiform encephalopathy (mad cow disease) and Creutzfeldt-Jakob disease.

The cells as referred to herein according to the invention originate from an animal, including vertebrate and non-vertebrate animals, preferably vertebrate animals, such as, without limitation, mammals, reptiles, fish, birds and amphibians, preferably mammals, such as, without limitation, primates, rodents, carnivores, artiodactyla, lagomorpha, etc. The cells may be human or non-human. The cells may be derived for instance from human, mouse, rat, or rabbit.

Biological samples as used in the various methods or compositions as discussed herein in one embodiment comprise immune cells, developing or undifferentiated cells, or healthy cells. Such biological samples may for instance comprise lymphoid tissues, such as primary or secondary lymph tissues (e.g. lymph fluid, thymus, lymph nodes, spleen, bone marrow, tonsils, Peyer's patches, mucosa associated lymphoid tissue (MALT), appendix) or blood. Alternatively, the biological sample may be any tissue sample comprising immune cells, developing or undifferentiated cells, or healthy cells.

According to certain aspects or embodiments of the invention, differential expression of protein or RNA is determined between samples, which may be differential expression of proteins or RNA based on single-cell analyses. Such single-cell-based analyses may be performed by techniques as discussed herein (e.g., Drop-Seq).

In certain aspects and embodiments, the invention relates to modulants, their use, and methods for identifying modulants, as discussed herein. As used herein, the term modulant may be used interchangeably with modulator. As used herein, a modulant refers preferably to a compound (or combination of compounds) which is capable of altering or affecting the functioning of the cell system, or of particular components of the cell system, such as one or more particular cells or cell types. A modulant may for instance alter the cell state or phenotype of particular immune cells or immune cell (sub)populations. A modulant may for instance increase or induce, or alternatively decrease or ablate, particular immune cells or immune cell (sub)populations, such that the entire immune cell population obtains a different functionality, such as for instance an improved immunological response towards a pathological condition. Modulants include any potential class of biologically active agent, such as for instance small molecules, drugs, TLR agonists, antagonists, genetic perturbations (e.g. knock-out, knock-down, or other types of (inactivating or otherwise modulating) mutations), etc. Methods for altering signatures, immune responder phenotypes, or immune cells such as immune cell (sub)populations may include contacting particular immune cells (or populations) with modulants as discussed herein, which may be done in vitro or in vivo. If performed in vitro, the so-treated cells may after treatment be administered to an individual in need thereof. In certain embodiments, the modulant may be provided in pharmaceutical compositions, and may for instance be included in a vaccine. Typically, a vaccine further comprises an antigen, wherein said antigen is preferably specific for a particular pathological condition, and/or may further comprise immune cells (e.g. antigen presenting cells, optionally primed with antigen).

The present invention also relates to compositions, such as pharmaceutical compositions, comprising the cells or cell (sub)populations as discussed herein, such as the cells or cell (sub)populations having a particular signature as discussed herein, or the cells or cell (sub)populations associated with or characteristic of particular responder phenotypes as discussed herein.

As noted elsewhere, pharmaceutical compositions as taught herein comprise one or more pharmaceutically acceptable excipients.

The term “pharmaceutically acceptable” as used herein is consistent with the art and means compatible with the other ingredients of a pharmaceutical composition and not deleterious to the recipient thereof.

As used herein, “carrier” or “excipient” includes any and all solvents, diluents, buffers (such as, e.g., neutral buffered saline or phosphate buffered saline), solubilisers, colloids, dispersion media, vehicles, fillers, chelating agents (such as, e.g., EDTA or glutathione), amino acids (such as, e.g., glycine), proteins, disintegrants, binders, lubricants, wetting agents, emulsifiers, sweeteners, colorants, flavourings, aromatisers, thickeners, agents for achieving a depot effect, coatings, antifungal agents, preservatives, stabilisers, antioxidants, tonicity controlling agents, absorption delaying agents, and the like. The use of such media and agents for pharmaceutical active substances is well known in the art. Such materials should be non-toxic and should not interfere with the activity of the cells or active components (e.g. modulators).

The precise nature of the carrier or excipient or other material will depend on the route of administration. For example, the composition may be in the form of a parenterally acceptable aqueous solution, which is pyrogen-free and has suitable pH, isotonicity and stability. For general principles in medicinal formulation, the reader is referred to Cell Therapy: Stem Cell Transplantation, Gene Therapy, and Cellular Immunotherapy, by G. Morstyn & W. Sheridan eds., Cambridge University Press, 1996; and Hematopoietic Stem Cell Therapy, E. D. Ball, J. Lister & P. Law, Churchill Livingstone, 2000.

The pharmaceutical composition can be applied parenterally, rectally, orally or topically. Preferably, the pharmaceutical composition may be used for intravenous, intramuscular, subcutaneous, peritoneal, peridural, rectal, nasal, pulmonary, mucosal, or oral application. In a preferred embodiment, the pharmaceutical composition according to the invention is intended to be used as an infusion. The skilled person will understand that compositions comprising modulators as discussed herein which are to be administered orally or topically will usually not comprise cells, although it may be envisioned for oral compositions to also comprise cells, for example when gastro-intestinal tract indications are treated. Each of the compounds as discussed herein (e.g. cells, modulators) may be administered by the same route or may be administered by a different route. By means of example, and without limitation, the cells may be administered parenterally and the modulator may be administered orally.

Liquid pharmaceutical compositions may generally include a liquid carrier such as water or a pharmaceutically acceptable aqueous solution. For example, physiological saline solution, tissue or cell culture media, dextrose or other saccharide solution or glycols such as ethylene glycol, propylene glycol or polyethylene glycol may be included.

The composition may include one or more cell protective molecules, cell regenerative molecules, growth factors, anti-apoptotic factors or factors that regulate gene expression in the cells. Such substances may render the cells independent of their environment.

Such pharmaceutical compositions may contain further components ensuring the viability of the cells therein. For example, the compositions may comprise a suitable buffer system (e.g., phosphate or carbonate buffer system) to achieve a desirable pH, more usually near neutral pH, and may comprise sufficient salt to ensure isosmotic conditions for the cells to prevent osmotic stress. For example, a suitable solution for these purposes may be phosphate-buffered saline (PBS), sodium chloride solution, Ringer's Injection or Lactated Ringer's Injection, as known in the art. Further, the composition may comprise a carrier protein, e.g., albumin (e.g., bovine or human albumin), which may increase the viability of the cells.

Further suitable pharmaceutically acceptable carriers or additives are well known to those skilled in the art and for instance may be selected from proteins such as collagen or gelatine, carbohydrates such as starch, polysaccharides, sugars (dextrose, glucose and sucrose), cellulose derivatives like sodium or calcium carboxymethylcellulose, hydroxypropyl cellulose or hydroxypropylmethyl cellulose, pregelatinized starches, pectin, agar, carrageenan, clays, hydrophilic gums (acacia gum, guar gum, arabic gum and xanthan gum), alginic acid, alginates, hyaluronic acid, polyglycolic and polylactic acid, dextran, pectins, synthetic polymers such as water-soluble acrylic polymer or polyvinylpyrrolidone, proteoglycans, calcium phosphate and the like.

If desired, the cell preparation can be administered on a support, scaffold, matrix or material to provide improved tissue regeneration. For example, the material can be a granular ceramic, or a biopolymer such as gelatine, collagen, or fibrinogen. Porous matrices can be synthesized according to standard techniques (e.g., Mikos et al., Biomaterials 14:323, 1993; Mikos et al., Polymer 35:1068, 1994; Cook et al., J. Biomed. Mater. Res. 35:513, 1997). Such support, scaffold, matrix or material may be biodegradable or non-biodegradable. Hence, the cells may be transferred to and/or cultured on a suitable substrate, such as a porous or non-porous substrate, to provide for implants. For example, cells that have proliferated, or that are being differentiated in culture dishes, can be transferred onto three-dimensional solid supports in order to cause them to multiply and/or continue the differentiation process by incubating the solid support in a liquid nutrient medium of the invention, if necessary. Cells can be transferred onto a three-dimensional solid support, e.g. by impregnating said support with a liquid suspension containing said cells. The impregnated supports obtained in this way can be implanted in a human subject. Such impregnated supports can also be re-cultured by immersing them in a liquid culture medium, prior to being finally implanted. The three-dimensional solid support needs to be biocompatible so as to enable it to be implanted in a human. It may be biodegradable or non-biodegradable.

The cells or cell (sub)populations can be administered in a manner that permits them to survive, grow, propagate and/or differentiate towards desired cell types (e.g. differentiation) or cell states. The cells or cell (sub)populations may be grafted to or may migrate to and engraft within the intended organ, such as, e.g., liver. Engraftment of the cells or cell (sub)populations in other places, tissues or organs such as liver, spleen, pancreas, kidney capsule, peritoneum or omentum may be envisaged.

In an embodiment, the pharmaceutical cell preparation as defined above may be administered in the form of a liquid composition. In embodiments, the cells or a pharmaceutical composition comprising them can be administered systemically, topically, within an organ or at a site of organ dysfunction or lesion.

Preferably, the pharmaceutical compositions may comprise a therapeutically effective amount of the desired cells. The term “therapeutically effective amount” refers to an amount which can elicit a biological or medicinal response in a tissue, system, animal or human that is being sought by a researcher, veterinarian, medical doctor or other clinician, and in particular can prevent or alleviate one or more of the local or systemic symptoms or features of a disease or condition being treated.

In certain embodiments, the invention involves compositions as discussed herein, such as the pharmaceutical compositions as discussed herein.

Platforms for Profiling Single Cells

Cells may be considered processing units. Core constituents include skin cells, such as fibroblasts, adipocytes and epithelial cells; immune cells, such as megakaryocytes, dendritic cells and T cells; brain cells, such as neurons, ependymal cells and astrocytes; and muscle cells, such as smooth muscle, skeletal muscle and cardiac muscle. However, grouped cells are not identical (see, e.g., M D Slack et al. PNAS 105, (2008) and P. Dalerba et al. Nat. Biotech. 29, (2011)) and differences can underlie unique behaviors (see, e.g., A A Cohen et al. Science, 322 (2008), S Tay et al. Nature 466, (2010) and O. Feinerman et al. Mol. Sys. Bio. 437, (2010)).

Four fundamental questions may be asked: 1. How do cells respond to change? 2. Which differences influence those responses? 3. When is variability mitigated and when is it leveraged? 4. How does heterogeneity affect the interactions between functionally different cells? To address these questions, one must be able to precisely measure and manipulate large numbers of single mammalian cells in parallel.

Applicants leverage nano- and micro-fabrication to develop new platforms for thoroughly and controllably profiling single cells.

Conventional methods for studying single cells include optical microscopy and flow/mass cytometry. With optical microscopy, temporal and spatial information may be obtained; however, there is limited depth due to spectral overlap between fluorophores. With flow/mass cytometry (see, e.g., S C Bendall et al. Science, 332 (2011)), one can profile many observables (such as 6-16 for flow cytometry and 34+ for mass cytometry) and it is easy to collect statistics on large numbers of cells. However, no spatial information may be obtained and it is extremely difficult to follow the same cell over time.

Intercellular signaling is a confounding variable. Controlling a cellular microenvironment may be done by any device or microdevice (such as reverse emulsions, microwells and microfluidics) that can define and maintain a constant, known extracellular milieu.

The present invention also involves utilizing nano- and micro-technology to complement current single cell studies. Using microstructures, Applicants can restrain single cells or their components at concentrations similar to 0.01-100 million cells/mL.
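As a rough illustration of the concentration range quoted above, confining a single cell in a small volume corresponds to a bulk density of one cell per that volume. The following sketch converts a confinement volume into an equivalent density. The function name and the example volumes are illustrative assumptions and are not dimensions taken from this disclosure.

```python
NL_PER_ML = 1_000_000  # 1 mL equals 1e6 nL


def equivalent_cells_per_ml(volume_nl: float, cells: int = 1) -> float:
    """Return the bulk density (cells/mL) equivalent to `cells` in `volume_nl` nanoliters."""
    if volume_nl <= 0:
        raise ValueError("volume_nl must be positive")
    return cells * NL_PER_ML / volume_nl


if __name__ == "__main__":
    # Hypothetical confinement volumes, chosen only to span the quoted range.
    for volume_nl in (100.0, 1.0, 0.01):
        density = equivalent_cells_per_ml(volume_nl)
        print(f"{volume_nl} nL per cell is about {density / 1e6:g} million cells/mL")
```

Under this simple model, the lower end of the range (0.01 million cells/mL) corresponds to one cell per 100 nL and the upper end (100 million cells/mL) to one cell per 0.01 nL (10 pL).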

Traditional population measurements require large amounts of starting material and enable deep profiling of a particular molecular species. For instance, normally, upon activation, a naïve T cell will differentiate into one of several functionally distinct T helper cell sub-types, depending upon the cytokine environment; some of these promote immune responses, whereas others ‘shut down’ immune activity. Each individual T helper cell also secretes lineage-dependent cytokines that can influence its own actions, as well as those of its peers. The correct balance between the sub-types is critical to normal immune function, and defects in the guiding molecular circuits can lead to autoimmune disorders, such as Multiple Sclerosis and Psoriasis, and allergies. Studying cellular decision making in populations of T cells on a single cell basis will reveal fundamental principles underlying cellular diversity, aid in developing diagnostics and therapeutic strategies for immune disorders, and provide a paradigm for similar studies in other mammalian cells.

To decouple single T cells from themselves and their peers, Applicants fabricate ordered “capture” sites for restraining cells during rapid solution perfusion. This enables examination of how cellular responses depend upon component heterogeneity in the absence of confounding external factors. By systematically controlling the levels of different cytokines, antigens, peptides, secreted factors, ligands, ions, DNA, or RNA in the perfusion media, Applicants also investigate how extracellular signaling affects intracellular circuits. Importantly, Applicants integrate optical and microfluidic controls so that optical interrogation and transcriptome analysis can be performed on the same cell.

HIV

In one embodiment, the present invention relates to innate immunity against the pathogen HIV-1 and to immune cell behavior, such as functional cDC maturation, analyzed at the single-cell level by, for example, RNA-seq. Applicants identified a highly functional cDC subset present in elite controllers (EC) that might provide critical information for future clinical approaches to induce highly effective T cell responses in a larger number of patients and individuals.

To further unravel complex immune system heterogeneity, in particular to establish cellular networks and cell interactions, aiming at improving diagnostic or therapeutic efforts, Applicants profiled cDCs from an EC at rest or exposed to pseudo-typed HIV-1 virus using single-cell RNA-Seq (scRNA-Seq), a recently developed approach that enables unbiased identification of the cell types, states, circuits, and molecular drivers normally convolved in a complex ensemble behavior (Shalek et al. Nature 2013; 2014; Patel et al. Science 2014; Satija and Shalek Trends in Immunology 2014). Applicants applied scRNA-Seq to identify a highly functional subpopulation of elite controller dendritic cells responding to viral infection. Applicants developed and validated a computational infrastructure for identifying intracellular nodes and selecting immunomodulators that preferentially rebalance immune subset composition. Using identified immunomodulators, such as TLR3 ligands, such as poly I:C, Applicants induced a larger population of the functional subset of dendritic cells in normal donors, such as characterized by (surface) expression of CD64 and PD-L1, or high (surface) expression of CD64 and PD-L1, or increased (surface) expression of CD64 and PD-L1 (e.g. as compared to dendritic cells not belonging to the functional subset).

Unexpectedly, virus-exposed cDCs separated into three distinct response groups, characterized by substantial differences in transcriptional programs associated with cellular activation and antiviral response. These clusters, influenced but not defined by viral interactions, revealed a new, highly functional group of cDCs, characterized by strong expression of innate immune activation genes and phenotypically distinguished by high surface expression of CD64 and PD-L1. Importantly, this group of cDCs has superior antigen presentation and T cell activation capabilities in vitro, and, although preferentially enriched in ECs, is common to all individuals. Using a combination of computational and experimental approaches to rationally uncover and test putative immunomodulators that can alter the relative abundances of these cDC groups, Applicants show that one can selectively adjuvant or inhibit this cDC subgroup in healthy individuals and ECs, respectively. With respect to EC, the study reveals functional heterogeneity in EC cDC responses and identifies transcriptional signatures associated with immune control of HIV-1 in cDCs; more generally, it demonstrates that immune system composition informs function and details new methodologies for identifying and rationally rebalancing salient components to realize, therapeutically or prophylactically, a desired response.

It has been established that these findings are not per se relevant only for, or characteristic only of, cDCs in control of HIV infection, but instead can be extrapolated to different types of immune cells, different types of pathological conditions, and/or different types of phenotypes or phenotypical behaviours, both at the single-cell level and at the cell population level.

B Cells/Monocytes

An objective of the B cell study was to identify patterns of differential expression (DE) between an elite controller and a non-neutralizer in cells tetramer-sorted for HIV protein (gp140) and in non-sorted samples.

Expression study steps at the population level include a differential expression analysis, which involves projection (all genes) and a direct comparison (gene-by-gene). Expression study steps at the cell level involve clustering and DE analysis.
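By way of illustration only, the gene-by-gene comparison step can be sketched as a per-gene log2 fold change and Welch t-statistic between two groups of samples. The statistic, the pseudocount, the function names and the gene labels in the usage note are assumptions chosen for this sketch; the disclosure does not specify a particular test.

```python
import math
from statistics import mean, variance


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 ratio of mean expression (a over b); the pseudocount avoids division by zero."""
    return math.log2((mean(group_a) + pseudocount) / (mean(group_b) + pseudocount))


def welch_t(group_a, group_b):
    """Welch t-statistic; each group needs at least two values."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    if se == 0.0:
        # No within-group spread: identical groups give 0, otherwise an infinite statistic.
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def rank_genes(expr_a, expr_b):
    """expr_a, expr_b map gene name -> list of expression values per sample.

    Returns (gene, log2 fold change, t) tuples for shared genes, sorted by |t| descending.
    """
    rows = []
    for gene in sorted(expr_a.keys() & expr_b.keys()):
        a, b = expr_a[gene], expr_b[gene]
        rows.append((gene, log2_fold_change(a, b), welch_t(a, b)))
    return sorted(rows, key=lambda row: abs(row[2]), reverse=True)
```

For example, calling `rank_genes({"geneA": [10, 11, 12], "geneB": [5, 6, 7]}, {"geneA": [1, 2, 3], "geneB": [5, 6, 7]})` with these hypothetical values places `geneA` first. A real analysis would additionally need normalization and multiple-testing correction, which are omitted here.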

Populations

One embodiment of the present invention is aimed at finding DE genes associated with neutralizing antibody (NA) breadth. NA breadth has been defined as an integer in the phenotyping, and this integer has changed since the original phenotyping.

One embodiment of the invention involves identifying signatures of HIV-1 specific broadly neutralizing antibody (bnAb) ontogeny using a systems biology approach, specifically identification of leukocellular subset-specific signatures defining HIV-1 controllers with and without HIV-1-specific neutralizing breadth, identification of signatures during pre-infection or early infection that prospectively predict the development of neutralizing breadth during the ensuing disease process, and generation of a comprehensive multidimensional database integrating all clinical, transcriptional, epigenetic and functional immunologic data.

Single-Cell RNAseq Analysis

One problem with RNA-seq analysis of bulk cell populations is that cell heterogeneity may confound results. Applicants therefore have started to use scRNA-seq assays for further investigation of dendritic cells. In initial experiments, LPS-stimulated bone marrow derived dendritic cells (BMDCs) were examined. While the gene expression levels of population replicates were tightly correlated with one another, there were substantial differences in gene expression between individual cells.

When Applicants compared dendritic cells with and without exposure to HIV on the population level, Applicants identified 26 genes that are differentially expressed between those two conditions. However, cells exposed to HIV could be classified into three different groups: two groups whose gene expression patterns differed grossly from those of HIV-unexposed cells, and one group that appears to have transcriptional signatures similar to those of HIV-1-unexposed cells. This shows that single-cell RNA-seq of DCs is technically possible and allows delineation of distinct classes of DCs with altered gene expression patterns.

To elaborate on this point, Applicants used this methodological approach to better understand gene expression patterns associated with the development of neutralizing breadth. Systems-level responses in our bodies represent the combined and coordinated behaviors of a highly diverse ensemble of cells. In the immune system, many specialized cells must work together to defend against myriad pathogenic threats, maintain long-term memory, and establish tolerance (Germain 2012). Moreover, the interplay between these cells must establish checks and balances to protect against autoimmunity or immunodeficiency (Littman and Rudensky 2010, Yosef, Shalek et al. 2013). Measuring these phenomena in bulk, however, blends and potentially masks the unique contributions of individual cells, particularly when their behaviors are heterogeneous or driven by rare cell types/states.

To overcome this issue, to date, analyses of immune cells have primarily relied on first dividing the system into distinct subpopulations from the “top-down,” typically based on the expression of cellular markers, and subsequently characterizing each bin independently. This strategy has cataloged the major cell types of the mammalian immune system, established more nuanced functional divisions (Shay and Kang 2013), and uncovered that balanced composition is essential for proper function. Illustratively, overproduction of a subset of T helper cells (pro-inflammatory Th17) (Yosef, Shalek et al. 2013), or an imbalance in the relative proportions of DC subtypes (Nakahara, Uchi et al. 2010), can lead to autoimmune disease; similarly, in cancer, the density and diversity of tumor-infiltrating lymphocytes (TILs) has been shown to be predictive of tumor recurrence and clinical outcome (Galon, Costes et al. 2006). Yet, while informative, these “top-down” approaches depend on pre-selected markers, biasing experimental design. Moreover, recent molecular studies have shown that even “identical” cells can substantially differ in gene expression, protein levels and phenotypic output (Cohen, Geva-Zatorsky et al. 2008, Raj and Van Oudenaarden 2009, Feinerman, Jentsch et al. 2010, Sharma, Lee et al. 2010, Bendall, Simonds et al. 2011, Dalerba, Kalisky et al. 2011), with important functional consequences (Cohen, Geva-Zatorsky et al. 2008, Feinerman, Jentsch et al. 2010, Sharma, Lee et al. 2010), highlighting the shortcomings of “top-down” schemes.

A complementary approach is to examine a system from the “bottom-up,” profiling its component cells individually. Until recently, single-cell-based approaches, such as fluorescence activated cell sorting (FACS) or immunofluorescence, had been technically limited to probing a few pre-selected RNAs or proteins (Cohen, Geva-Zatorsky et al. 2008, Raj and Van Oudenaarden 2009, Sharma, Lee et al. 2010, Bendall, Simonds et al. 2011, Dalerba, Kalisky et al. 2011), hindering one's ability to uncover novel factors. The recent emergence of single cell genomic approaches, and especially single-cell RNA-Seq (scRNA-seq), opens a new path for unbiased molecular profiling of individual immune cells from which Applicants can identify cell states and their associated signatures. For example, in Applicants' own work, using scRNA-Seq of 18 ‘identical’ dendritic cells (DCs) exposed to lipopolysaccharide (LPS), Applicants discovered extensive bimodality in the DC response at multiple levels, including in the expression of key immune response genes and the splicing of RNA, which Applicants independently validated by RNA-FISH of 25 selected transcripts. By examining the co-variation between different genes across just 18 single cells, Applicants were able to decipher two distinct cell states and an interferon-driven antiviral circuit that Applicants subsequently validated in murine knockout models. Applicants then developed a high-throughput workflow for profiling many individual cells across different experimental conditions and used it to prepare scRNA-Seq libraries from over 1,800 BMDCs stimulated with three pathogenic components (Shalek, Satija et al. 2014). Here, Applicants identified a rare (~1%) sub-population of precocious responders that expresses a core module of antiviral genes very early; this same module becomes active in all cells at later time points. By stimulating cells individually in sealed microfluidic chambers and analyzing DCs from knockout mice, Applicants showed that these precocious cells propagate and coordinate this response through interferon-mediated paracrine signaling. Surprisingly, the precocious cells are also essential for suppressing an early-induced inflammatory gene module. Taken together, these findings demonstrate the power and promise of single cell genomics, and highlight the importance of cell type/state, the microenvironment and inter-cellular communication in establishing and coordinating complex dynamic responses at the ensemble/system level.

Determining Genetic Interactions

In one aspect, the invention provides a method for determining genetic interactions. This method involves causing a set of P genetic perturbations in cells, wherein the method may comprise: determining, based upon random sampling, a subset of π genetic perturbations from the set of P genetic perturbations; performing said subset of π genetic perturbations in a population of cells; performing single-cell molecular profiling of the population of genetically perturbed cells of the preceding step; and inferring single-cell molecular profiles for the set of P genetic perturbations in cells.
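The random-sampling step of the method above can be sketched as drawing a subset without replacement from the full set of perturbations. This is a minimal sketch under stated assumptions: uniform sampling, the function name, and the placeholder perturbation labels are all illustrative, and the later profiling and inference steps are not implemented here.

```python
import random


def sample_perturbation_subset(perturbations, subset_size, seed=None):
    """Draw `subset_size` distinct perturbations uniformly at random from the full set."""
    pool = list(perturbations)
    if not 0 < subset_size <= len(pool):
        raise ValueError("subset_size must be between 1 and the number of perturbations")
    rng = random.Random(seed)  # a local generator keeps the draw reproducible for a given seed
    return rng.sample(pool, subset_size)


if __name__ == "__main__":
    # Hypothetical labels standing in for the full set of P perturbations.
    full_set = [f"perturbation_{i}" for i in range(1000)]
    subset = sample_perturbation_subset(full_set, subset_size=50, seed=0)
    print(len(subset), "of", len(full_set), "perturbations selected")
```

Fixing the seed makes the sampled design reproducible, which may be convenient when the same subset has to be regenerated for downstream analysis.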

The perturbation methods as discussed herein may also be used to validate particular gene signatures or specific modulators, such as for instance to identify relevant genes or pathways which are involved in obtaining a particular gene signature, responder phenotype, or particular cell (sub)population, or cell state.

The population of cells with a plurality of genomic sequence or perturbation conditions involves a plurality of cells and perturbations to be tested and measurements sampled to obtain meaningful data and to infer appropriate circuits. The number of genes perturbed, and how many are perturbed simultaneously (the order of the perturbation: pairs, triplets, etc.), varies. A related sampling question is: in a tissue with n cell types, the rarest present at m%, how many cells X must be sequenced so that at least Y cells of the rarest subtype are included?

For example, ~500 cells ensures a ≥95% chance of including ≥10 cells of each type, based on the following calculation. Assume the most conservative scenario, in which, of M cell subtypes (for example, 12), all but one have the lowest predicted proportion (for example, p_min = 5%). Assuming that the Central Limit Theorem holds (a reasonable assumption when solving to detect at least 10 cells of each type), the number of cells of each type i, termed T_i, will be distributed with E[T_i] = N·p_min and STDV[T_i] = √(N·p_min·(1 − p_min)). The minimal N (total number of cells to profile) can then be solved such that all (M − 1) subtypes have at least n cells (the last, majority, subtype easily clears this threshold since its proportion is much higher). Applicants confirmed with simulation that the strategy conservatively holds in practice even for n<10, and take an additional (conservative) margin of error to allow for subsequent failed RNA-Seq experiments (<20-30%, depending on protocol).
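The calculation above can be sketched numerically as follows. The disclosure does not state how the joint ≥95% requirement is divided among the (M − 1) rare subtypes, so this sketch assumes a Bonferroni split of the allowed failure probability; that choice, the function name and the parameter names are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def min_cells_to_profile(n_subtypes, p_min, min_per_type, confidence=0.95,
                         max_cells=10_000_000):
    """Smallest N such that, under a normal approximation, every rare subtype
    (proportion p_min) yields at least min_per_type cells with joint probability
    of at least `confidence`."""
    if n_subtypes < 2 or not 0.0 < p_min < 1.0:
        raise ValueError("need at least two subtypes and 0 < p_min < 1")
    n_rare = n_subtypes - 1
    # Assumption: split the allowed failure probability evenly across rare subtypes.
    per_type_failure = (1.0 - confidence) / n_rare
    z = NormalDist().inv_cdf(1.0 - per_type_failure)
    for n_cells in range(1, max_cells + 1):
        expected = n_cells * p_min                            # E[T_i]
        spread = math.sqrt(n_cells * p_min * (1.0 - p_min))   # STDV[T_i]
        if expected - z * spread >= min_per_type:
            return n_cells
    raise ValueError("no solution found below max_cells")


if __name__ == "__main__":
    # Values from the example in the text: M = 12, p_min = 5%, at least 10 cells per type.
    print(min_cells_to_profile(n_subtypes=12, p_min=0.05, min_per_type=10))
```

With the example values from the text, this sketch returns a value in the low-to-mid 400s, which leaves headroom below the ~500 cells quoted above for the additional margin that covers failed libraries.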

Modelling Genetic Interactions

The method of the invention may be used for determining genetic interactions, including modelling and/or analyzing such interactions. Such genetic interactions form part of cellular circuitry, in that the interactions reflect connections of components within one or more cellular pathways. Such pathways may be intracellular pathways or intercellular pathways.

In some embodiments, the method of the invention may further comprise determining genetic interactions.

In some embodiments, the method of the invention may further comprise confirming genetic interactions with additional genetic manipulations.

The method may further comprise a validation step, wherein additional manipulations are performed in order to confirm previously identified genetic interactions. Such validation step may include in vivo or in vitro experiments, such as gene inactivation, gene deletion, gene activation or overexpression, and combinations thereof. Such genetic manipulations may be performed with any genetic tool available in the art, comprising but not limited to RNAi, CRISPR-Cas based gene editing, nucleic acid transfection, etc.

Genetic Perturbations

In one aspect, said set of P genetic perturbations or said subset of π genetic perturbations may comprise single-order genetic perturbations. Within the meaning of the present invention, single-order genetic perturbation means that a given cell undergoes a single genetic perturbation (one perturbation per cell).

In one aspect, said set of P genetic perturbations or said subset of π genetic perturbations may comprise combinatorial genetic perturbations. Within the meaning of the present invention, combinatorial or higher-order genetic perturbation means that a given cell undergoes a combination of k single-order genetic perturbations (k perturbations per cell), with k>1. In some embodiments, k is an integer ranging from 2 to 15. In some embodiments, k=2, 3, 4, 5, 6, 7, 8, 9 or 10.

Within the meaning of the present invention, said genetic perturbation may comprise gene knock-down (gene repression or gene inactivation), gene knock-out (gene deletion), gene activation, gene insertion, or regulatory element modulation (deletion or mutation).

Combinations of different types of genetic perturbations are also envisioned within the meaning of the present invention. For example, a combination of genetic perturbations may comprise a knock-down of a first gene combined with an activation of a second gene, etc.

In one aspect, said set of P genetic perturbations or said subset of π genetic perturbations may comprise genome-wide perturbations. Genome-wide perturbations are genetic perturbations that affect loci across the entire genome. Genome-wide perturbation may include single perturbations of >100, >200, >500, >1,000, >2,500, >5,000, >10,000, >15,000 or >20,000 single genomic loci. The present invention encompasses k-order combinations of genome-wide perturbations.

In some embodiments, the method may comprise determining k-order genetic interactions.

In some embodiments, said set of P genetic perturbations may comprise combinatorial genetic perturbations, such as k-order combinations of single-order genetic perturbations, wherein k is an integer ranging from 2 to 15, and step (e) may comprise determining j-order genetic interactions, with j<k. Such embodiments rely on sampling higher-order interactions in order to more efficiently infer lower order ones. Given a limited number of possible assays, one is more powered to determine lower order interactions (e.g., 2-, 3-way) from measuring higher order interactions (e.g., 5-way) than from allotting all assays to the lower order, because any higher order interaction carries some information about all interaction terms up to that order (e.g., in compressed sensing, it informs in convolved form on additional Fourier coefficients). Thus, even if most interactions are low order (2- or 3-way), these embodiments are more powered to detect them.
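
The counting argument behind this statement can be illustrated as follows. This is only a sketch of how many lower-order gene subsets a single higher-order perturbed cell touches; it is not an implementation of compressed sensing or of any inference procedure of the invention. The gene names and the function name covered_subsets are hypothetical.

```python
from itertools import combinations
from math import comb


def covered_subsets(perturbed_genes, order):
    """All gene subsets of the given order contained in one perturbed cell."""
    return list(combinations(sorted(perturbed_genes), order))


if __name__ == "__main__":
    # One hypothetical cell carrying a 5-way combinatorial perturbation.
    cell = ["geneA", "geneB", "geneC", "geneD", "geneE"]
    pairs = covered_subsets(cell, 2)
    triples = covered_subsets(cell, 3)
    # A single 5-way assay touches every pair and triple among its genes.
    assert len(pairs) == comb(5, 2)
    assert len(triples) == comb(5, 3)
    print(len(pairs), "pairs and", len(triples), "triples in one 5-way cell")
```

A single 5-way perturbed cell therefore carries (convolved) information on 10 pairwise and 10 three-way terms, whereas a 2-way perturbed cell informs on only one pair.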

CRISPR-Cas Systems

In some embodiments, RNAi- or CRISPR-Cas-based perturbation may be performed. Said perturbation may be performed (e.g. “delivered”) in an array-format or pool-format. Some embodiments may comprise pooled single or combinatorial CRISPR-Cas-based perturbation with a genome-wide library of sgRNAs, wherein each sgRNA comprises a unique molecular identifier. In some embodiments, a step may comprise pooled combinatorial CRISPR-Cas-based perturbation with a genome-wide library of sgRNAs, wherein each sgRNA comprises a unique molecular identifier and is co-delivered with a reporter mRNA.

CRISPR-Cas systems, including CRISPR-Cas9 systems, as used herein, refer to non-naturally occurring systems derived from bacterial Clustered Regularly Interspaced Short Palindromic Repeats loci. These systems generally comprise an enzyme (Cas protein, such as Cas9 protein) and one or more RNAs. Said RNA is a CRISPR RNA and may be an sgRNA. Said RNA and/or said enzyme may be engineered, for example for optimal use in mammalian cells, for optimal delivery therein, for optimal activity therein, for specific uses in gene editing, etc.

sgRNA refers to a CRISPR single-guide RNA. This RNA is a component of a CRISPR-Cas system. The sequence of the sgRNA determines the target sequence for gene editing, knock-down, knock-out, insertion, etc. For genome-wide approaches, it is possible to design and construct suitable sgRNA libraries. Such sgRNAs may be delivered to cells using vector delivery such as viral vector delivery. Combination of CRISPR-Cas-mediated perturbations may be obtained by delivering multiple sgRNAs within a single cell. This may be achieved in pooled format. In the case of sgRNA viral vector delivery, combined perturbation may be obtained by delivering several sgRNA vectors to the same cell. This may also be achieved in pooled format, and the number of combined perturbations in a cell then corresponds to the MOI (multiplicity of infection). Using CRISPR-Cas systems, one may generally implement MOI values of up to 10, 12 or 15.

The CRISPR-Cas system may be implemented in order to cause massively combinatorial molecular perturbations (MCMP), including single-order and combinatorial genome-wide genetic perturbations.
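
As an illustration of how the MOI relates to the number of combined perturbations per cell, the sketch below assumes that the number of sgRNA vectors entering a cell follows a Poisson distribution with mean equal to the MOI. This is a common modelling assumption and is not stated in the present text; the function names are hypothetical.

```python
from math import exp, factorial


def poisson_pmf(k, moi):
    """Probability that a cell receives exactly k sgRNA vectors (Poisson model)."""
    return exp(-moi) * moi ** k / factorial(k)


def fraction_with_at_least(k_min, moi):
    """Fraction of cells expected to carry k_min or more sgRNA vectors."""
    return 1.0 - sum(poisson_pmf(k, moi) for k in range(k_min))


if __name__ == "__main__":
    for moi in (1, 5, 10):
        frac = fraction_with_at_least(2, moi)
        print(f"MOI={moi}: fraction of cells with >=2 guides = {frac:.3f}")
```

Under this assumed model, the MOI sets the average order of perturbation, while individual cells in the pool carry a spread of lower and higher orders around that average.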

CRISPR-Cas-based gene editing allows pooled genome-scale screens with expression readouts to be performed in primary cells (Parnas O., Jovanovic M., Eisenhaure T M., Herbst R H., Dixit A., Ye C J., Przybylski D., Platt R J., Tirosh I., Sanjana N E., Shalem O., Satija R., Raychowdhury R., Mertins P., Carr S A., Zhang F., Hacohen N., Regev A. A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks. Cell. 2015 Jul. 30; 162(3):675-86. doi: 10.1016/j.cell.2015.06.059. Epub 2015 Jul. 16).

In some embodiments, the present invention involves combinatorial perturbations by way of CRISPR-Cas (such as CRISPR-Cas9) assays. In accordance with the present invention, sampling a far-from-exhaustive number of higher order perturbations, when coupled with complex genomic readouts, may suffice to resolve most non-linear relations. Accordingly, in some aspects, the present invention combines pooled, combinatorial perturbations with genomic readout into Massively Combinatorial Perturbation Profiling (MCPP).

In some embodiments, the method of the invention may comprise one or more CRISPR-Cas-based assays. Such CRISPR-Cas assays are advantageous for implementing a precise perturbation of genes and their expression levels.

In some embodiments, CRISPR-Cas systems may be used to knock out protein-coding genes by frameshifts (indels). Embodiments include efficient and specific CRISPR-Cas9 mediated knockout (Gilbert, L. A., Horlbeck, M. A., Adamson, B., Villalta, J. E., Chen, Y., Whitehead, E. H., Guimaraes, C., Panning, B., Ploegh, H. L., Bassik, M. C., Qi, L. S., Kampmann, M. & Weissman, J. S. Genome-Scale CRISPR-Mediated Control of Gene Repression and Activation. Cell. 159, 647-661, doi:10.1016/j.cell.2014.09.029 (2014). PMCID:4253859; Ran, F. A., Cong, L., Yan, W. X., Scott, D. A., Gootenberg, J. S., Kriz, A. J., Zetsche, B., Shalem, O., Wu, X., Makarova, K. S., Koonin, E. V., Sharp, P. A. & Zhang, F. In vivo genome editing using Staphylococcus aureus Cas9. Nature. 520, 186-191, doi:10.1038/nature14299 (2015). PMCID:4393360), including a CRISPR mediated double-nicking to efficiently modify both alleles of a target gene or multiple target loci (Ran, F. A., Hsu, P. D., Lin, C. Y., Gootenberg, J. S., Konermann, S., Trevino, A. E., Scott, D. A., Inoue, A., Matoba, S., Zhang, Y. & Zhang, F. Double nicking by RNA-guided CRISPR Cas9 for enhanced genome editing specificity. Cell. 154, 1380-1389, doi:10.1016/j.cell.2013.08.021 (2013). PMCID:3856256; Wang, H., Yang, H., Shivalila, C. S., Dawlaty, M. M., Cheng, A. W., Zhang, F. & Jaenisch, R. One-step generation of mice carrying mutations in multiple genes by CRISPR-Cas-mediated genome engineering. Cell. 153, 910-918, doi:10.1016/j.cell.2013.04.025 (2013). PMCID:3969854) and implementation of a smaller Cas9 protein for delivery on smaller vectors (Ran, F. A., Cong, L., Yan, W. X., Scott, D. A., Gootenberg, J. S., Kriz, A. J., Zetsche, B., Shalem, O., Wu, X., Makarova, K. S., Koonin, E. V., Sharp, P. A. & Zhang, F. In vivo genome editing using Staphylococcus aureus Cas9. Nature. 520, 186-191, doi:10.1038/nature14299 (2015). PMCID:4393360).

CRISPR-mediated activation or inactivation (CRISPRa/i) systems may be used to activate or inactivate gene transcription. Briefly, a nuclease-dead (deactivated) Cas9 RNA-guided DNA binding domain (dCas9) (Qi, L. S., Larson, M. H., Gilbert, L. A., Doudna, J. A., Weissman, J. S., Arkin, A. P. & Lim, W. A. Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell. 152, 1173-1183, doi:10.1016/j.cell.2013.02.022 (2013). PMCID:3664290) tethered to transcriptional repressor domains that promote epigenetic silencing (e.g., KRAB) forms a “CRISPRi” (Gilbert, L. A., Larson, M. H., Morsut, L., Liu, Z., Brar, G. A., Torres, S. E., Stern-Ginossar, N., Brandman, O., Whitehead, E. H., Doudna, J. A., Lim, W. A., Weissman, J. S. & Qi, L. S. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell. 154, 442-451, doi:10.1016/j.cell.2013.06.044 (2013). PMCID:3770145; Konermann, S., Brigham, M. D., Trevino, A. E., Hsu, P. D., Heidenreich, M., Cong, L., Platt, R. J., Scott, D. A., Church, G. M. & Zhang, F. Optical control of mammalian endogenous transcription and epigenetic states. Nature. 500, 472-476, doi:10.1038/nature12466 (2013). PMCID:3856241) that represses transcription. To use dCas9 as an activator (CRISPRa), a guide RNA may be engineered to carry RNA binding motifs (e.g., MS2) that recruit effector domains fused to RNA-motif binding proteins, increasing transcription (Konermann, S., Brigham, M. D., Trevino, A. E., Joung, J., Abudayyeh, O. O., Barcena, C., Hsu, P. D., Habib, N., Gootenberg, J. S., Nishimasu, H., Nureki, O. & Zhang, F. Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature. 517, 583-588, doi:10.1038/nature14136 (2015). PMCID:4420636).

CRISPR-Cas systems may also be used for the deletion of regulatory elements. To target non-coding elements, pairs of guides may be designed and used to delete regions of a defined size, and to tile deletions covering sets of regions in pools. The delivery of two sgRNAs may mediate efficient excision of 500 bp genomic fragments.

CRISPR-Cas systems may also be used for gene editing, e.g. by RNA-templated homologous recombination (Keskin, H., Shen, Y., Huang, F., Patel, M., Yang, T., Ashley, K., Mazin, A. V. & Storici, F. Transcript-RNA-templated DNA recombination and repair. Nature. 515, 436-439, doi:10.1038/nature13682 (2014)).
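
The tiling of a non-coding region with fixed-size deletions can be sketched as below. This is an illustrative layout only: it computes the intervals that each pair of guides would flank and does not perform guide design (e.g., PAM or off-target checks). The coordinates, the step size, and the function name tile_deletions are hypothetical.

```python
def tile_deletions(region_start, region_end, deletion_size=500, step=250):
    """Return (start, end) intervals tiling a region with fixed-size deletions.

    Each interval would be flanked by one pair of sgRNAs; consecutive
    intervals overlap whenever step < deletion_size.
    """
    if deletion_size <= 0 or step <= 0:
        raise ValueError("deletion_size and step must be positive")
    tiles = []
    start = region_start
    while start + deletion_size <= region_end:
        tiles.append((start, start + deletion_size))
        start += step
    return tiles


if __name__ == "__main__":
    # Hypothetical 2 kb regulatory region, tiled with 500 bp deletions.
    for left, right in tile_deletions(10_000, 12_000):
        print(f"guide pair flanking {left}-{right}")
```

Overlapping tiles, as chosen here, mean that each position in the interior of the region is removed by more than one guide pair, which may help attribute an effect to a sub-interval.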

CRISPR transgenic mice may be used to derive ‘CRISPR-ready’ cells. ‘CRISPR-mice’ are mice where the mouse germ line is engineered to harbor key elements of a CRISPR system, and cells require only the programmable (sgRNA) element to activate the CRISPR-Cas system. CRISPR mice include Cas9-transgenic mice (Platt, R. J., Chen, S., Zhou, Y., Yim, M. J., Swiech, L., Kempton, H. R., Dahlman, J. E., Parnas, O., Eisenhaure, T. M., Jovanovic, M., Graham, D. B., Jhunjhunwala, S., Heidenreich, M., Xavier, R. J., Langer, R., Anderson, D. G., Hacohen, N., Regev, A., Feng, G., Sharp, P. A. & Zhang, F. CRISPR-Cas9 knockin mice for genome editing and cancer modeling. Cell. 159, 440-455, doi:10.1016/j.cell.2014.09.014 (2014). PMCID:4265475; Parnas O., Jovanovic M., Eisenhaure T M., Herbst R H., Dixit A., Ye C J., Przybylski D., Platt R J., Tirosh I., Sanjana N E., Shalem O., Satija R., Raychowdhury R., Mertins P., Carr S A., Zhang F., Hacohen N., Regev A. A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks. Cell. 2015 Jul. 30; 162(3):675-86. doi: 10.1016/j.cell.2015.06.059. Epub 2015 Jul. 16).

CRISPR-Cas based perturbations, including single order or higher order perturbations, may be implemented in pooled format. The perturbation (screen) may be performed with expression readouts or reporter expression readout (genome-wide reporter-based pooled screens).

CRISPR-Cas functional genomics assays that may be used to cause sets of genetic perturbations are discussed in Shalem O., Sanjana N E., Zhang F. High-throughput functional genomics using CRISPR-Cas9. Nat Rev Genet. May; 16(5):299-311. (2015). doi: 10.1038/nrg3899. Epub 2015 Apr. 9.

sgRNA libraries, including genome-wide libraries, may be designed as discussed in Parnas O., Jovanovic M., Eisenhaure T M., Herbst R H., Dixit A., Ye C J., Przybylski D., Platt R J., Tirosh I., Sanjana N E., Shalem O., Satija R., Raychowdhury R., Mertins P., Carr S A., Zhang F., Hacohen N., Regev A. A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks. Cell. 2015 Jul. 30; 162(3):675-86. doi: 10.1016/j.cell.2015.06.059. Epub 2015 Jul. 16; Sanjana, N. E., Shalem, O. & Zhang, F. Improved vectors and genome-wide libraries for CRISPR screening. Nat Methods. 11, 783-784, doi:10.1038/nmeth.3047 (2014); Shalem, O., Sanjana, N. E., Hartenian, E., Shi, X., Scott, D. A., Mikkelsen, T. S., Heckl, D., Ebert, B. L., Root, D. E., Doench, J. G. & Zhang, F. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 343, 84-87, doi:10.1126/science.1247005 (2014). PMCID:4089965; Shalem, O., Sanjana, N. E. & Zhang, F. High-throughput functional genomics using CRISPR-Cas9. Nat Rev Genet. 16, 299-311, doi:10.1038/nrg3899 (2015).

A pooled genome-wide screen for CRISPR-mediated KO (knock-out) may be performed as in Shalem, O., Sanjana, N. E., Hartenian, E., Shi, X., Scott, D. A., Mikkelsen, T. S., Heckl, D., Ebert, B. L., Root, D. E., Doench, J. G. & Zhang, F. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 343, 84-87, doi:10.1126/science.1247005 (2014). PMCID:4089965.

An expression marker-based genome-wide CRISPR screen may be performed as in Parnas O., Jovanovic M., Eisenhaure T M., Herbst R H., Dixit A., Ye C J., Przybylski D., Platt R J., Tirosh I., Sanjana N E., Shalem O., Satija R., Raychowdhury R., Mertins P., Carr S A., Zhang F., Hacohen N., Regev A. A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks. Cell. 2015 Jul. 30; 162(3):675-86. doi: 10.1016/j.cell.2015.06.059. Epub 2015 Jul. 16.

A pooled, genome-scale, CRISPRa screen may be performed as in Konermann, S., Brigham, M. D., Trevino, A. E., Joung, J., Abudayyeh, O. O., Barcena, C., Hsu, P. D., Habib, N., Gootenberg, J. S., Nishimasu, H., Nureki, O. & Zhang, F. Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature. 517, 583-588, doi:10.1038/nature14136 (2015). PMCID:4420636.

Pooled combinatorial perturbations may be performed, where the delivered perturbations and impact (molecular profiling) are determined post hoc, in either a conventional readout (e.g., sorting followed by sequencing) or with high-content single cell genomics.

In some embodiments, the CRISPR-Cas screen is performed by co-delivering multiple sgRNAs using viral vector delivery (e.g., sgRNA encoding vectors at a relatively high MOI) into cells pre-expressing the Cas9 enzyme to obtain as many higher order combinations as possible. For small sets of ~5 genes one may generate a combinatorially complete ascertained set of all 32 perturbations.
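
The figure of 32 follows from 2^5: each of 5 genes is either perturbed or not. A minimal sketch enumerating such a combinatorially complete set is given below; the gene names and the function name all_perturbation_sets are hypothetical, and the count of 32 includes the unperturbed (empty) combination.

```python
from itertools import combinations


def all_perturbation_sets(genes):
    """Every subset of the gene list, from no perturbation to all genes."""
    subsets = []
    for k in range(len(genes) + 1):
        subsets.extend(combinations(genes, k))
    return subsets


if __name__ == "__main__":
    genes = ["g1", "g2", "g3", "g4", "g5"]  # hypothetical 5-gene set
    subsets = all_perturbation_sets(genes)
    assert len(subsets) == 2 ** len(genes)
    for subset in subsets:
        print(subset if subset else "(unperturbed)")
```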

To detect which perturbations were co-delivered in pooled bins, several strategies are envisioned: (1) Combining two or more guide barcodes using in situ PCR in PEG hydrogel that restricts the diffusion of double stranded DNA; (2) Split-pool tagging of guide barcodes in hydrogels, such that only guides from the same cell are tagged with the same sequence; (3) FISH of expressed guides for imaging readouts. In each case it is possible to use an error-correction scheme in the barcodes.
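
One simple form of error correction is to choose barcodes that are far apart in Hamming distance and to decode each read to its nearest barcode. The sketch below illustrates that idea only; the text above does not specify which error-correction scheme is used, and the barcode sequences and function names here are hypothetical.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def decode(read, barcodes, max_errors=1):
    """Return the unique barcode within max_errors of the read, else None."""
    hits = [bc for bc in barcodes if hamming(read, bc) <= max_errors]
    return hits[0] if len(hits) == 1 else None


if __name__ == "__main__":
    barcodes = ["ACGTAC", "TGCAAC", "ACGATG"]  # hypothetical guide barcodes
    # A minimum distance of 3 allows any single substitution to be corrected.
    print("minimum pairwise distance:", min_pairwise_distance(barcodes))
    print("read ACGTAA decodes to:", decode("ACGTAA", barcodes))
```

With a minimum pairwise distance of d, up to floor((d−1)/2) substitutions per read can be corrected unambiguously, which is why max_errors is set to 1 for this example set.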

To detect which perturbations were co-delivered with a single cell genomics readout, it is possible to report the (combinatorial) perturbation in a manner compatible with the full genomic readout. For example, one may use an sgRNA vector that also highly expresses a synthetic polyadenylated RNA reporter of the sgRNA barcode. This RNA will be captured along with the cellular mRNA in the transcriptome profiling, e.g., scRNA-seq (Drop-Seq, see below), or reported by FISH hybridization, such that the same assay ascertains the sgRNAs and their impact on expression (Parnas O., Jovanovic M., Eisenhaure T M., Herbst R H., Dixit A., Ye C J., Przybylski D., Platt R J., Tirosh I., Sanjana N E., Shalem O., Satija R., Raychowdhury R., Mertins P., Carr S A., Zhang F., Hacohen N., Regev A. A Genome-wide CRISPR Screen in Primary Immune Cells to Dissect Regulatory Networks. Cell. 2015 Jul. 30; 162(3):675-86. doi: 10.1016/j.cell.2015.06.059. Epub 2015 Jul. 16).
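
Because the reporter RNA is captured together with cellular mRNA, the set of perturbations in each cell can in principle be read out from the same sequencing data. The sketch below shows one possible way to do this, assuming reads have already been parsed into (cell barcode, UMI, guide barcode) tuples; the input format, the min_umis threshold, and the function name assign_perturbations are hypothetical and are not taken from the present text.

```python
from collections import defaultdict


def assign_perturbations(reads, min_umis=2):
    """Map each cell barcode to the guide barcodes detected in that cell.

    reads: iterable of (cell_barcode, umi, guide_barcode) tuples.
    A guide is called in a cell only if it is supported by at least
    min_umis distinct UMIs (an illustrative noise filter).
    """
    umis = defaultdict(set)
    for cell, umi, guide in reads:
        umis[(cell, guide)].add(umi)

    calls = defaultdict(set)
    for (cell, guide), seen in umis.items():
        if len(seen) >= min_umis:
            calls[cell].add(guide)
    return {cell: sorted(guides) for cell, guides in calls.items()}


if __name__ == "__main__":
    reads = [  # hypothetical parsed reporter reads
        ("cell1", "u1", "guideA"), ("cell1", "u2", "guideA"),
        ("cell1", "u3", "guideB"), ("cell1", "u4", "guideB"),
        ("cell2", "u1", "guideC"), ("cell2", "u1", "guideC"),
    ]
    print(assign_perturbations(reads))
```

In this toy input, cell1 is called as a two-guide (combinatorial) perturbation, while cell2 yields no call because its single guide is supported by only one distinct UMI.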

With respect to general information on CRISPR-Cas Systems, components thereof, and delivery of such components, including methods, materials, delivery vehicles, vectors, particles, AAV, and making and using thereof, including as to amounts and formulations, all useful in the practice of the instant invention, reference is made to: U.S. Pat. Nos. 8,999,641, 8,993,233, 8,945,839, 8,932,814, 8,906,616, 8,895,308, 8,889,418, 8,889,356, 8,871,445, 8,865,406, 8,795,965, 8,771,945 and 8,697,359; US Patent Publications US 2014-0310830 (U.S. application Ser. No. 14/105,031), US 2014-0287938 A1 (U.S. application Ser. No. 14/213,991), US 2014-0273234 A1 (U.S. application Ser. No. 14/293,674), US 2014-0273232 A1 (U.S. application Ser. No. 14/290,575), US 2014-0273231 (U.S. application Ser. No. 14/259,420), US 2014-0256046 A1 (U.S. application Ser. No. 14/226,274), US 2014-0248702 A1 (U.S. application Ser. No. 14/258,458), US 2014-0242700 A1 (U.S. application Ser. No. 14/222,930), US 2014-0242699 A1 (U.S. application Ser. No. 14/183,512), US 2014-0242664 A1 (U.S. application Ser. No. 14/104,990), US 2014-0234972 A1 (U.S. application Ser. No. 14/183,471), US 2014-0227787 A1 (U.S. application Ser. No. 14/256,912), US 2014-0189896 A1 (U.S. application Ser. No. 14/105,035), US 2014-0186958 (U.S. application Ser. No. 14/105,017), US 2014-0186919 A1 (U.S. application Ser. No. 14/104,977), US 2014-0186843 A1 (U.S. application Ser. No. 14/104,900), US 2014-0179770 A1 (U.S. application Ser. No. 14/104,837) and US 2014-0179006 A1 (U.S. application Ser. No. 14/183,486), US 2014-0170753 (U.S. application Ser. No. 14/183,429); European Patents EP 2 784 162 B1 and EP 2 771 468 B1; European Patent Applications EP 2 771 468 (EP13818570.7), EP 2 764 103 (EP13824232.6), and EP 2 784 162 (EP14170383.5); and PCT Patent Publications WO 2014/093661 (PCT/US2013/074743), WO 2014/093694 (PCT/US2013/074790), WO 2014/093595 (PCT/US2013/074611), WO 2014/093718 (PCT/US2013/074825), WO 2014/093709 (PCT/US2013/074812), WO 2014/093622 (PCT/US2013/074667), WO 2014/093635 (PCT/US2013/074691), WO 2014/093655 (PCT/US2013/074736), WO 2014/093712 (PCT/US2013/074819), WO 2014/093701 (PCT/US2013/074800), WO 2014/018423 (PCT/US2013/051418), WO 2014/204723 (PCT/US2014/041790), WO 2014/204724 (PCT/US2014/041800), WO 2014/204725 (PCT/US2014/041803), WO 2014/204726 (PCT/US2014/041804), WO 2014/204727 (PCT/US2014/041806), WO 2014/204728 (PCT/US2014/041808), WO 2014/204729 (PCT/US2014/041809). Reference is also made to U.S. provisional patent applications 61/758,468; 61/802,174; 61/806,375; 61/814,263; 61/819,803 and 61/828,130, filed on Jan. 30, 2013; Mar. 15, 2013; Mar. 28, 2013; Apr. 20, 2013; May 6, 2013 and May 28, 2013 respectively. Reference is also made to U.S. provisional patent application 61/836,123, filed on Jun. 17, 2013. Reference is additionally made to U.S. provisional patent applications 61/835,931, 61/835,936, 61/836,127, 61/836,101, 61/836,080 and 61/835,973, each filed Jun. 17, 2013. Further reference is made to U.S. provisional patent applications 61/862,468 and 61/862,355 filed on Aug. 5, 2013; 61/871,301 filed on Aug. 28, 2013; 61/960,777 filed on Sep. 25, 2013 and 61/961,980 filed on Oct. 28, 2013. Reference is yet further made to: PCT Patent applications Nos: PCT/US2014/041803, PCT/US2014/041800, PCT/US2014/041809, PCT/US2014/041804 and PCT/US2014/041806, each filed Jun. 10, 2014; PCT/US2014/041808 filed Jun. 11, 2014; and PCT/US2014/62558 filed Oct. 28, 2014, and U.S. Provisional Patent Applications Ser. Nos.: 61/915,150, 61/915,301, 61/915,267 and 61/915,260, each filed Dec. 12, 2013; 61/757,972 and 61/768,959, filed on Jan. 29, 2013 and Feb. 25, 2013; 61/835,936, 61/836,127, 61/836,101, 61/836,080, 61/835,973, and 61/835,931, filed Jun. 17, 2013; 62/010,888 and 62/010,879, both filed Jun. 11, 2014; 62/010,329 and 62/010,441, each filed Jun. 10, 2014; 61/939,228 and 61/939,242, each filed Feb. 12, 2014; 61/980,012, filed Apr. 15, 2014; 62/038,358, filed Aug. 17, 2014; 62/054,490, 62/055,484, 62/055,460 and 62/055,487, each filed Sep. 25, 2014; and 62/069,243, filed Oct. 27, 2014. Reference is also made to U.S. provisional patent applications Nos. 62/055,484, 62/055,460, and 62/055,487, filed Sep. 25, 2014; U.S. provisional patent application 61/980,012, filed Apr. 15, 2014; and U.S. provisional patent application 61/939,242 filed Feb. 12, 2014. Reference is made to PCT application designating, inter alia, the United States, application No. PCT/US14/41806, filed Jun. 10, 2014. Reference is made to U.S. provisional patent application 61/930,214 filed on Jan. 22, 2014. Reference is made to U.S. provisional patent applications 61/915,251; 61/915,260 and 61/915,267, each filed on Dec. 12, 2013. Reference is made to U.S. provisional patent application USSN 61/980,012 filed Apr. 15, 2014.

Mention is also made of U.S. application 62/091,455, filed 12 Dec. 14, PROTECTED GUIDE RNAS (PGRNAS); U.S. application 62/096,708, 24 Dec. 14, PROTECTED GUIDE RNAS (PGRNAS); U.S. application 62/091,462, 12 Dec. 14, DEAD GUIDES FOR CRISPR TRANSCRIPTION FACTORS; U.S. application 62/096,324, 23 Dec. 14, DEAD GUIDES FOR CRISPR TRANSCRIPTION FACTORS; U.S. application 62/091,456, 12 Dec. 14, ESCORTED AND FUNCTIONALIZED GUIDES FOR CRISPR-CAS SYSTEMS; U.S. application 62/091,461, 12 Dec. 14, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR GENOME EDITING AS TO HEMATOPOIETIC STEM CELLS (HSCs); U.S. application 62/094,903, 19 Dec. 14, UNBIASED IDENTIFICATION OF DOUBLE-STRAND BREAKS AND GENOMIC REARRANGEMENT BY GENOME-WISE INSERT CAPTURE SEQUENCING; U.S. application 62/096,761, 24 Dec. 14, ENGINEERING OF SYSTEMS, METHODS AND OPTIMIZED ENZYME AND GUIDE SCAFFOLDS FOR SEQUENCE MANIPULATION; U.S. application 62/098,059, 30 Dec. 14, RNA-TARGETING SYSTEM; U.S. application 62/096,656, 24 Dec. 14, CRISPR HAVING OR ASSOCIATED WITH DESTABILIZATION DOMAINS; U.S. application 62/096,697, 24 Dec. 14, CRISPR HAVING OR ASSOCIATED WITH AAV; U.S. application 62/098,158, 30 Dec. 14, ENGINEERED CRISPR COMPLEX INSERTIONAL TARGETING SYSTEMS; U.S. application 62/151,052, 22 Apr. 15, CELLULAR TARGETING FOR EXTRACELLULAR EXOSOMAL REPORTING; U.S. application 62/054,490, 24 Sep. 14, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR TARGETING DISORDERS AND DISEASES USING PARTICLE DELIVERY COMPONENTS; U.S. application 62/055,484, 25 Sep. 14, SYSTEMS, METHODS AND COMPOSITIONS FOR SEQUENCE MANIPULATION WITH OPTIMIZED FUNCTIONAL CRISPR-CAS SYSTEMS; U.S. application 62/087,537, 4 Dec. 14, SYSTEMS, METHODS AND COMPOSITIONS FOR SEQUENCE MANIPULATION WITH OPTIMIZED FUNCTIONAL CRISPR-CAS SYSTEMS; U.S. application 62/054,651, 24 Sep. 14, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR MODELING COMPETITION OF MULTIPLE CANCER MUTATIONS IN VIVO; U.S. application 62/067,886, 23 Oct. 14, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR MODELING COMPETITION OF MULTIPLE CANCER MUTATIONS IN VIVO; U.S. application 62/054,675, 24 Sep. 14, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS IN NEURONAL CELLS/TISSUES; U.S. application 62/054,528, 24 Sep. 14, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS IN IMMUNE DISEASES OR DISORDERS; U.S. application 62/055,454, 25 Sep. 14, DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR TARGETING DISORDERS AND DISEASES USING CELL PENETRATION PEPTIDES (CPP); U.S. application 62/055,460, 25 Sep. 14, MULTIFUNCTIONAL-CRISPR COMPLEXES AND/OR OPTIMIZED ENZYME LINKED FUNCTIONAL-CRISPR COMPLEXES; U.S. application 62/087,475, 4 Dec. 14, FUNCTIONAL SCREENING WITH OPTIMIZED FUNCTIONAL CRISPR-CAS SYSTEMS; U.S. application 62/055,487, 25 Sep. 14, FUNCTIONAL SCREENING WITH OPTIMIZED FUNCTIONAL CRISPR-CAS SYSTEMS; U.S. application 62/087,546, 4 Dec. 14, MULTIFUNCTIONAL CRISPR COMPLEXES AND/OR OPTIMIZED ENZYME LINKED FUNCTIONAL-CRISPR COMPLEXES; and U.S. application 62/098,285, 30 Dec. 14, CRISPR MEDIATED IN VIVO MODELING AND GENETIC SCREENING OF TUMOR GROWTH AND METASTASIS.

Each of these patents, patent publications, and applications, and all documents cited therein or during their prosecution (“appln cited documents”) and all documents cited or referenced in the appln cited documents, together with any instructions, descriptions, product specifications, and product sheets for any products mentioned therein or in any document therein and incorporated by reference herein, are hereby incorporated herein by reference, and may be employed in the practice of the invention. All documents (e.g., these patents, patent publications and applications and the appln cited documents) are incorporated herein by reference to the same extent as if each individual document was specifically and individually indicated to be incorporated by reference.

Also with respect to general information on CRISPR-Cas Systems, mention is made of the following (also hereby incorporated herein by reference):

Multiplex genome engineering using CRISPR-Cas systems. Cong, L., Ran, F. A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P. D., Wu, X., Jiang, W., Marraffini, L. A., & Zhang, F. Science February 15; 339(6121):819-23 (2013);

RNA-guided editing of bacterial genomes using CRISPR-Cas systems. Jiang W., Bikard D., Cox D., Zhang F, Marraffini L A. Nat Biotechnol March; 31(3):233-9 (2013);

One-Step Generation of Mice Carrying Mutations in Multiple Genes by CRISPR-Cas-Mediated Genome Engineering. Wang H., Yang H., Shivalila C S., Dawlaty M M., Cheng A W., Zhang F., Jaenisch R. Cell May 9; 153(4):910-8 (2013);

Optical control of mammalian endogenous transcription and epigenetic states. Konermann S, Brigham M D, Trevino A E, Hsu P D, Heidenreich M, Cong L, Platt R J, Scott D A, Church G M, Zhang F. Nature. Aug. 22; 500(7463):472-6. doi: 10.1038/nature12466. Epub 2013 Aug. 23 (2013);

Double Nicking by RNA-Guided CRISPR Cas9 for Enhanced Genome Editing Specificity. Ran, F A., Hsu, P D., Lin, C Y., Gootenberg, J S., Konermann, S., Trevino, A E., Scott, D A., Inoue, A., Matoba, S., Zhang, Y., & Zhang, F. Cell August 28. pii: S0092-8674(13)01015-5 (2013-A);

DNA targeting specificity of RNA-guided Cas9 nucleases. Hsu, P., Scott, D., Weinstein, J., Ran, F A., Konermann, S., Agarwala, V., Li, Y., Fine, E., Wu, X., Shalem, O., Cradick, T J., Marraffini, L A., Bao, G., & Zhang, F. Nat Biotechnol doi:10.1038/nbt.2647 (2013);

Genome engineering using the CRISPR-Cas9 system. Ran, F A., Hsu, P D., Wright, J., Agarwala, V., Scott, D A., Zhang, F. Nature Protocols November; 8(11):2281-308 (2013-B);

Genome-Scale CRISPR-Cas9 Knockout Screening in Human Cells. Shalem, O., Sanjana, N E., Hartenian, E., Shi, X., Scott, D A., Mikkelsen, T., Heckl, D., Ebert, B L., Root, D E., Doench, J G., Zhang, F. Science December 12. (2013). [Epub ahead of print];

Crystal structure of cas9 in complex with guide RNA and target DNA. Nishimasu, H., Ran, F A., Hsu, P D., Konermann, S., Shehata, S I., Dohmae, N., Ishitani, R., Zhang, F., Nureki, O. Cell February 27, 156(5):935-49 (2014);

Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Wu X., Scott D A., Kriz A J., Chiu A C., Hsu P D., Dadon D B., Cheng A W., Trevino A E., Konermann S., Chen S., Jaenisch R., Zhang F., Sharp P A. Nat Biotechnol. April 20. doi: 10.1038/nbt.2889 (2014);

CRISPR-Cas9 Knockin Mice for Genome Editing and Cancer Modeling. Platt R J, Chen S, Zhou Y, Yim M J, Swiech L, Kempton H R, Dahlman J E, Parnas O, Eisenhaure T M, Jovanovic M, Graham D B, Jhunjhunwala S, Heidenreich M, Xavier R J, Langer R, Anderson D G, Hacohen N, Regev A, Feng G, Sharp P A, Zhang F. Cell 159(2): 440-455 DOI: 10.1016/j.cell.2014.09.014 (2014);

Development and Applications of CRISPR-Cas9 for Genome Engineering, Hsu P D, Lander E S, Zhang F., Cell. June 5; 157(6):1262-78 (2014);

Genetic screens in human cells using the CRISPR-Cas9 system, Wang T, Wei J J, Sabatini D M, Lander E S., Science. January 3; 343(6166): 80-84. doi:10.1126/science.1246981 (2014);

Rational design of highly active sgRNAs for CRISPR-Cas9-mediated gene inactivation, Doench J G, Hartenian E, Graham D B, Tothova Z, Hegde M, Smith I, Sullender M, Ebert B L, Xavier R J, Root D E., (published online 3 Sep. 2014) Nat Biotechnol. December; 32(12):1262-7 (2014);

In vivo interrogation of gene function in the mammalian brain using CRISPR-Cas9, Swiech L, Heidenreich M, Banerjee A, Habib N, Li Y, Trombetta J, Sur M, Zhang F., (published online 19 Oct. 2014) Nat Biotechnol. January; 33(1):102-6 (2015);

Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex, Konermann S, Brigham M D, Trevino A E, Joung J, Abudayyeh O O, Barcena C, Hsu P D, Habib N, Gootenberg J S, Nishimasu H, Nureki O, Zhang F., Nature. January 29; 517(7536):583-8 (2015);

A split-Cas9 architecture for inducible genome editing and transcription modulation, Zetsche B, Volz S E, Zhang F., (published online 2 Feb. 2015) Nat Biotechnol. February; 33(2):139-42 (2015);

Genome-wide CRISPR Screen in a Mouse Model of Tumor Growth and Metastasis, Chen S, Sanjana N E, Zheng K, Shalem O, Lee K, Shi X, Scott D A, Song J, Pan J Q, Weissleder R, Lee H, Zhang F, Sharp P A. Cell 160, 1246-1260, Mar. 12, 2015 (multiplex screen in mouse), and

In vivo genome editing using Staphylococcus aureus Cas9, Ran F A, Cong L, Yan W X, Scott D A, Gootenberg J S, Kriz A J, Zetsche B, Shalem O, Wu X, Makarova K S, Koonin E V, Sharp P A, Zhang F., (published online 1 Apr. 2015), Nature. April 9; 520(7546):186-91 (2015).

each of which is incorporated herein by reference, may be considered in the practice of the instant invention, and is discussed briefly below:

Cong et al. engineered type II CRISPR-Cas systems for use in eukaryotic cells based on both Streptococcus thermophilus Cas9 and also Streptococcus pyogenes Cas9 and demonstrated that Cas9 nucleases can be directed by short RNAs to induce precise cleavage of DNA in human and mouse cells. Their study further showed that Cas9, when converted into a nicking enzyme, can be used to facilitate homology-directed repair in eukaryotic cells with minimal mutagenic activity. Additionally, their study demonstrated that multiple guide sequences can be encoded into a single CRISPR array to enable simultaneous editing of several endogenous genomic loci sites within the mammalian genome, demonstrating easy programmability and wide applicability of the RNA-guided nuclease technology. This ability to use RNA to program sequence specific DNA cleavage in cells defined a new class of genome engineering tools. These studies further showed that other CRISPR loci are likely to be transplantable into mammalian cells and can also mediate mammalian genome cleavage. Importantly, it can be envisaged that several aspects of the CRISPR-Cas system can be further improved to increase its efficiency and versatility.

Jiang et al. used the clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas9 endonuclease complexed with dual-RNAs to introduce precise mutations in the genomes of Streptococcus pneumoniae and Escherichia coli. The approach relied on dual-RNA:Cas9-directed cleavage at the targeted genomic site to kill unmutated cells and circumvented the need for selectable markers or counter-selection systems. The study reported reprogramming dual-RNA:Cas9 specificity by changing the sequence of short CRISPR RNA (crRNA) to make single- and multinucleotide changes carried on editing templates. The study showed that simultaneous use of two crRNAs enabled multiplex mutagenesis. Furthermore, when the approach was used in combination with recombineering, in S. pneumoniae, nearly 100% of cells that were recovered using the discussed approach contained the desired mutation, and in E. coli, 65% that were recovered contained the mutation.

Wang et al. (2013) used the CRISPR-Cas system for the one-step generation of mice carrying mutations in multiple genes, which were traditionally generated in multiple steps by sequential recombination in embryonic stem cells and/or time-consuming intercrossing of mice with a single mutation. The CRISPR-Cas system will greatly accelerate the in vivo study of functionally redundant genes and of epistatic gene interactions.

Konermann et al. (2013) addressed the need in the art for versatile and robust technologies that enable optical and chemical modulation of DNA-binding domains based on the CRISPR Cas9 enzyme and also on Transcriptional Activator Like Effectors.

Ran et al. (2013-A) described an approach that combined a Cas9 nickase mutant with paired guide RNAs to introduce targeted double-strand breaks. This addresses the issue of the Cas9 nuclease from the microbial CRISPR-Cas system being targeted to specific genomic loci by a guide sequence, which can tolerate certain mismatches to the DNA target and thereby promote undesired off-target mutagenesis. Because individual nicks in the genome are repaired with high fidelity, simultaneous nicking via appropriately offset guide RNAs is required for double-stranded breaks and extends the number of specifically recognized bases for target cleavage. The authors demonstrated that using paired nicking can reduce off-target activity by 50- to 1,500-fold in cell lines and can facilitate gene knockout in mouse zygotes without sacrificing on-target cleavage efficiency. This versatile strategy enables a wide variety of genome editing applications that require high specificity.

Hsu et al. (2013) characterized SpCas9 targeting specificity in human cells to inform the selection of target sites and avoid off-target effects. The study evaluated >700 guide RNA variants and SpCas9-induced indel mutation levels at >100 predicted genomic off-target loci in 293T and 293FT cells. The authors showed that SpCas9 tolerates mismatches between guide RNA and target DNA at different positions in a sequence-dependent manner, sensitive to the number, position and distribution of mismatches. The authors further showed that SpCas9-mediated cleavage is unaffected by DNA methylation and that the dosage of SpCas9 and sgRNA can be titrated to minimize off-target modification. Additionally, to facilitate mammalian genome engineering applications, the authors reported providing a web-based software tool to guide the selection and validation of target sequences as well as off-target analyses.

Ran et al. (2013-B) described a set of tools for Cas9-mediated genome editing via non-homologous end joining (NHEJ) or homology-directed repair (HDR) in mammalian cells, as well as generation of modified cell lines for downstream functional studies. To minimize off-target cleavage, the authors further described a double-nicking strategy using the Cas9 nickase mutant with paired guide RNAs. The protocol provided by the authors gives experimentally derived guidelines for the selection of target sites, evaluation of cleavage efficiency and analysis of off-target activity. The studies showed that beginning with target design, gene modifications can be achieved within as little as 1-2 weeks, and modified clonal cell lines can be derived within 2-3 weeks.

Shalem et al. described a new way to interrogate gene function on a genome-wide scale. Their studies showed that delivery of a genome-scale CRISPR-Cas9 knockout (GeCKO) library targeting 18,080 genes with 64,751 unique guide sequences enabled both negative and positive selection screening in human cells. First, the authors showed use of the GeCKO library to identify genes essential for cell viability in cancer and pluripotent stem cells. Next, in a melanoma model, the authors screened for genes whose loss is involved in resistance to vemurafenib, a therapeutic that inhibits mutant protein kinase BRAF. Their studies showed that the highest-ranking candidates included previously validated genes NF1 and MED12 as well as novel hits NF2, CUL3, TADA2B, and TADA1. The authors observed a high level of consistency between independent guide RNAs targeting the same gene and a high rate of hit confirmation, and thus demonstrated the promise of genome-scale screening with Cas9.

Nishimasu et al. reported the crystal structure of Streptococcus pyogenes Cas9 in complex with sgRNA and its target DNA at 2.5 Å resolution. The structure revealed a bilobed architecture composed of target recognition and nuclease lobes, accommodating the sgRNA:DNA heteroduplex in a positively charged groove at their interface. Whereas the recognition lobe is essential for binding sgRNA and DNA, the nuclease lobe contains the HNH and RuvC nuclease domains, which are properly positioned for cleavage of the complementary and non-complementary strands of the target DNA, respectively. The nuclease lobe also contains a carboxyl-terminal domain responsible for the interaction with the protospacer adjacent motif (PAM). This high-resolution structure and accompanying functional analyses have revealed the molecular mechanism of RNA-guided DNA targeting by Cas9, thus paving the way for the rational design of new, versatile genome-editing technologies.

Wu et al. mapped genome-wide binding sites of a catalytically inactive Cas9 (dCas9) from Streptococcus pyogenes loaded with single guide RNAs (sgRNAs) in mouse embryonic stem cells (mESCs). The authors showed that each of the four sgRNAs tested targets dCas9 to between tens and thousands of genomic sites, frequently characterized by a 5-nucleotide seed region in the sgRNA and an NGG protospacer adjacent motif (PAM). Chromatin inaccessibility decreases dCas9 binding to other sites with matching seed sequences; thus 70% of off-target sites are associated with genes. The authors showed that targeted sequencing of 295 dCas9 binding sites in mESCs transfected with catalytically active Cas9 identified only one site mutated above background levels. The authors proposed a two-state model for Cas9 binding and cleavage, in which a seed match triggers binding but extensive pairing with target DNA is required for cleavage.
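As an illustration of the seed-plus-PAM pattern described above, the following minimal Python sketch lists positions in a DNA string where a short seed sequence is immediately followed by an NGG PAM. The function name, the example sequence, and the restriction to the forward strand are assumptions made purely for illustration; this is not the analysis performed by Wu et al.

```python
import re

def find_seed_pam_sites(dna, seed):
    """Return 0-based start positions where `seed` is directly followed by NGG.

    Only the forward strand is scanned (an illustrative simplification).
    """
    dna = dna.upper()
    seed = seed.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile("(?=" + re.escape(seed) + "[ACGT]GG)")
    return [match.start() for match in pattern.finditer(dna)]

# Hypothetical sequence: the seed ACGTA occurs three times, twice before NGG.
example = "TTACGTAAGGCCACGTATGGACGTACCA"
print(find_seed_pam_sites(example, "ACGTA"))  # [2, 12]
```

A seed match next to a PAM is, under the two-state model summarized above, only a candidate binding site; cleavage would additionally require extensive pairing with the target DNA, which this sketch does not model.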

Platt et al. established a Cre-dependent Cas9 knockin mouse. The authors demonstrated in vivo as well as ex vivo genome editing using adeno-associated virus (AAV)-, lentivirus-, or particle-mediated delivery of guide RNA in neurons, immune cells, and endothelial cells.

Hsu et al. (2014) is a review article that discusses generally CRISPR-Cas9 history from yogurt to genome editing, including genetic screening of cells.

Wang et al. (2014) relates to a pooled, loss-of-function genetic screening approach suitable for both positive and negative selection that uses a genome-scale lentiviral single guide RNA (sgRNA) library.

Doench et al. created a pool of sgRNAs, tiling across all possible target sites of a panel of six endogenous mouse and three endogenous human genes, and quantitatively assessed their ability to produce null alleles of their target gene by antibody staining and flow cytometry. The authors showed that optimization of the PAM improved activity and also provided an on-line tool for designing sgRNAs.

Swiech et al. demonstrate that AAV-mediated SpCas9 genome editing can enable reverse genetic studies of gene function in the brain.

Konermann et al. (2015) discusses the ability to attach multiple effector domains, e.g., transcriptional activator, functional and epigenomic regulators at appropriate positions on the guide such as stem or tetraloop with and without linkers.

Zetsche et al. demonstrates that the Cas9 enzyme can be split into two and hence the assembly of Cas9 for activation can be controlled.

Chen et al. relates to multiplex screening by demonstrating that a genome-wide in vivo CRISPR-Cas9 screen in mice reveals genes regulating lung metastasis.

Ran et al. (2015) relates to SaCas9 and its ability to edit genomes and demonstrates that one cannot extrapolate from biochemical assays.

Also, “Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing”, Shengdar Q. Tsai, Nicolas Wyvekens, Cyd Khayter, Jennifer A. Foden, Vishal Thapar, Deepak Reyon, Mathew J. Goodwin, Martin J. Aryee, J. Keith Joung, Nature Biotechnology 32(6): 569-77 (2014), relates to dimeric RNA-guided FokI nucleases that recognize extended sequences and can edit endogenous genes with high efficiencies in human cells.

Useful in the practice of the instant invention, reference is made to the article entitled BCL11A enhancer dissection by Cas9-mediated in situ saturating mutagenesis. Canver, M. C., Smith, E. C., Sher, F., Pinello, L., Sanjana, N. E., Shalem, O., Chen, D. D., Schupp, P. G., Vinjamur, D. S., Garcia, S. P., Luc, S., Kurita, R., Nakamura, Y., Fujiwara, Y., Maeda, T., Yuan, G., Zhang, F., Orkin, S. H., & Bauer, D. E. DOI:10.1038/nature15521, published online Sep. 16, 2015, the article is herein incorporated by reference and discussed briefly below:

Canver et al. describes novel pooled CRISPR-Cas9 guide RNA libraries to perform in situ saturating mutagenesis of the human and mouse BCL11A erythroid enhancers previously identified as an enhancer associated with fetal hemoglobin (HbF) level and whose mouse ortholog is necessary for erythroid BCL11A expression. This approach revealed critical minimal features and discrete vulnerabilities of these enhancers. Through editing of primary human progenitors and mouse transgenesis, the authors validated the BCL11A erythroid enhancer as a target for HbF reinduction. The authors generated a detailed enhancer map that informs therapeutic genome editing.

Reference is made to Zetsche et al., “Cpf1 Is a Single RNA-Guided Endonuclease of a Class 2 CRISPR-Cas System,” Cell 163, 1-13 (Oct. 22, 2015) and Shmakov et al., “Discovery and Functional Characterization of Diverse Class 2 CRISPR-Cas Systems,” Molecular Cell 60, 1-13 (Available online Oct. 22, 2015). Zetsche et al. (2015) reported the characterization of Cpf1, a putative class 2 CRISPR effector. It was demonstrated that Cpf1 mediates robust DNA interference with features distinct from Cas9. Identifying this mechanism of interference broadens our understanding of CRISPR-Cas systems and advances their genome editing applications. Shmakov et al. (2015) reported the characterization of three distinct Class 2 CRISPR-Cas systems. The effectors of two of the identified systems, C2c1 and C2c3, contain RuvC-like endonuclease domains distantly related to Cpf1. The third system, C2c2, contains an effector with two predicted HEPN RNase domains. Mention is also made of “Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing”, Shengdar Q. Tsai, Nicolas Wyvekens, Cyd Khayter, Jennifer A. Foden, Vishal Thapar, Deepak Reyon, Mathew J. Goodwin, Martin J. Aryee, J. Keith Joung, Nature Biotechnology 32(6): 569-77 (2014), which relates to dimeric RNA-guided FokI nucleases that recognize extended sequences and can edit endogenous genes with high efficiencies in human cells. In addition, mention is made of PCT application PCT/US14/70057, Attorney Reference 47627.99.2060 and BI-2013/107 entitled “DELIVERY, USE AND THERAPEUTIC APPLICATIONS OF THE CRISPR-CAS SYSTEMS AND COMPOSITIONS FOR TARGETING DISORDERS AND DISEASES USING PARTICLE DELIVERY COMPONENTS” (claiming priority from one or more or all of U.S. provisional patent applications: 62/054,490, filed Sep. 24, 2014; 62/010,441, filed Jun. 10, 2014; and 61/915,118, 61/915,215 and 61/915,148, each filed on Dec. 12, 2013) (“the Particle Delivery PCT”), incorporated herein by reference, with respect to a method of preparing an sgRNA-and-Cas9 protein containing particle comprising admixing a mixture comprising an sgRNA and Cas9 protein (and optionally HDR template) with a mixture comprising or consisting essentially of or consisting of surfactant, phospholipid, biodegradable polymer, lipoprotein and alcohol; and particles from such a process. For example, Cas9 protein and sgRNA were mixed together at a suitable molar ratio, e.g., 3:1 to 1:3 or 2:1 to 1:2 or 1:1, at a suitable temperature, e.g., 15-30° C., e.g., 20-25° C., e.g., room temperature, for a suitable time, e.g., 15-45, such as 30 minutes, advantageously in sterile, nuclease free buffer, e.g., 1× PBS. Separately, particle components such as or comprising: a surfactant, e.g., cationic lipid, e.g., 1,2-dioleoyl-3-trimethylammonium-propane (DOTAP); phospholipid, e.g., dimyristoylphosphatidylcholine (DMPC); biodegradable polymer, such as an ethylene-glycol polymer or PEG; and a lipoprotein, such as a low-density lipoprotein, e.g., cholesterol, were dissolved in an alcohol, advantageously a C1-6 alkyl alcohol, such as methanol, ethanol, isopropanol, e.g., 100% ethanol. The two solutions were mixed together to form particles containing the Cas9-sgRNA complexes. Accordingly, sgRNA may be pre-complexed with the Cas9 protein, before formulating the entire complex in a particle. Formulations may be made with a different molar ratio of different components known to promote delivery of nucleic acids into cells (e.g., 1,2-dioleoyl-3-trimethylammonium-propane (DOTAP), 1,2-ditetradecanoyl-sn-glycero-3-phosphocholine (DMPC), polyethylene glycol (PEG), and cholesterol). For example, DOTAP:DMPC:PEG:Cholesterol molar ratios may be DOTAP 100, DMPC 0, PEG 0, Cholesterol 0; or DOTAP 90, DMPC 0, PEG 10, Cholesterol 0; or DOTAP 90, DMPC 0, PEG 5, Cholesterol 5. That application accordingly comprehends admixing sgRNA, Cas9 protein and components that form a particle; as well as particles from such admixing. Aspects of the instant invention can involve particles; for example, particles using a process analogous to that of the Particle Delivery PCT, e.g., by admixing a mixture comprising sgRNA and/or Cas9 as in the instant invention and components that form a particle, e.g., as in the Particle Delivery PCT, to form a particle and particles from such admixing (or, of course, other particles involving sgRNA and/or Cas9 as in the instant invention). These and other CRISPR-Cas or CRISPR systems can be used in the practice of the invention.

Lentivirus

Lentiviruses are complex retroviruses that have the ability to infect and express their genes in both mitotic and post-mitotic cells. The most commonly known lentivirus is the human immunodeficiency virus (HIV); HIV-based vectors can be pseudotyped with the envelope glycoproteins of other viruses to target a broad range of cell types.

Lentiviruses may be prepared as follows. After cloning pCasES10 (which contains a lentiviral transfer plasmid backbone), HEK293FT cells at low passage (p=5) were seeded in a T-75 flask to 50% confluence the day before transfection in DMEM with 10% fetal bovine serum and without antibiotics. After 20 hours, media was changed to OptiMEM (serum-free) media and transfection was done 4 hours later. Cells were transfected with 10 μg of lentiviral transfer plasmid (pCasES10) and the following packaging plasmids: 5 μg of pMD2.G (VSV-g pseudotype), and 7.5 μg of psPAX2 (gag/pol/rev/tat). Transfection was done in 4 mL OptiMEM with a cationic lipid delivery agent (50 μL Lipofectamine 2000 and 100 μL Plus reagent). After 6 hours, the media was changed to antibiotic-free DMEM with 10% fetal bovine serum.

Lentivirus may be purified as follows. Viral supernatants were harvested after 48 hours. Supernatants were first cleared of debris and filtered through a 0.45 μm low protein binding (PVDF) filter. They were then spun in an ultracentrifuge for 2 hours at 24,000 rpm. Viral pellets were resuspended in 50 μL of DMEM overnight at 4° C. They were then aliquoted and immediately frozen at −80° C.

Other Perturbations

The invention also involves perturbing by subjecting the cell to an increase or decrease in temperature. The temperature may range from about 0° C. to about 100° C., advantageously about 10° C., 15° C., 20° C., 25° C., 30° C., 35° C., 40° C., 45° C., 50° C., 55° C., 60° C., 65° C., 70° C., 75° C., 80° C., 85° C., 90° C., 95° C. or 100° C. In another embodiment, the temperature may be closer to a physiological temperature, e.g., about 30° C., 31° C., 32° C., 33° C., 34° C., 35° C., 36° C., 37° C., 38° C., 39° C. or 40° C.

The invention also involves perturbing by subjecting the cell to a chemical agent. Examples of chemical agents include, but are not limited to, an antibiotic, monoclonal antibody, cancer therapeutic, direct cellular toxin, a small molecule, a hormone, a hormone derivative, a steroid or a steroid derivative.

In one aspect of the invention the perturbing may be with an energy source such as electromagnetic energy or ultrasound. The electromagnetic energy may be a component of visible light having a wavelength in the range of 450 nm-700 nm. In a preferred embodiment the component of visible light may have a wavelength in the range of 450 nm-500 nm and may be blue light. The blue light may have an intensity of at least 0.2 mW/cm², or more preferably at least 4 mW/cm². In another embodiment, the component of visible light may have a wavelength in the range of 620-700 nm and is red light.

The invention also involves perturbing by subjecting the cell to a chemical agent and/or temperature gradient. A biomolecular gradient may be formed, for example, as reviewed in Keenan and Folch, Lab Chip. 2008 January; 8(1): doi:10.1039/b711887b. Biomolecule gradients have been shown to play roles in a wide range of biological processes including development, inflammation, wound healing, and cancer metastasis. Elucidation of these phenomena requires the ability to expose cells to biomolecule gradients that are quantifiable, controllable, and mimic those that are present in vivo.

A chemical gradient may be formed without requiring fluid flow (see, e.g., Abhyankar et al., Lab Chip, 2006, 6, 389-393). This device consists of a membrane-covered source region and a large volume sink region connected by a microfluidic channel. The high fluidic resistance of the membrane limits fluid flow caused by pressure differences in the system, but allows diffusive transport of a chemical species through the membrane and into the channel. The large volume sink region at the end of the microfluidic channel helps to maintain spatial and temporal stability of the gradient. The chemical gradient in a 0.5 mm region near the sink region experiences a maximum of 10 percent change between the 6 and 24 h data points. Abhyankar et al., Lab Chip, 2006, 6, 389-393 present the theory, design, and characterization of this device and provide an example of neutrophil chemotaxis as proof of concept for future quantitative cell-signaling applications.

In another embodiment, a gradient may also be introduced with nanowires. In this embodiment, the nanowires do not necessarily introduce a gradient but may introduce other things into the system. A generalized platform for introducing a diverse range of biomolecules into living cells in high-throughput could transform how complex cellular processes are probed and analyzed. Shalek et al., PNAS, Feb. 2, 2010, vol. 107, no. 5 demonstrate spatially localized, efficient, and universal delivery of biomolecules into immortalized and primary mammalian cells using surface-modified vertical silicon nanowires. The method relies on the ability of the silicon nanowires to penetrate a cell's membrane and subsequently release surface-bound molecules directly into the cell's cytosol, thus allowing highly efficient delivery of biomolecules without chemical modification or viral packaging. This modality enables one to assess the phenotypic consequences of introducing a broad range of biological effectors (DNAs, RNAs, peptides, proteins, and small molecules) into almost any cell type. Shalek et al. show that this platform can be used to guide neuronal progenitor growth with small molecules, knock down transcript levels by delivering siRNAs, inhibit apoptosis using peptides, and introduce targeted proteins to specific organelles. Shalek et al. further demonstrate codelivery of siRNAs and proteins on a single substrate in a microarray format, highlighting this technology's potential as a robust, monolithic platform for high-throughput, miniaturized bioassays.

A gradient may be established, for example, in a fluidic device, such as a microfluidic device (see, e.g., Tehranirokh et al., BIOMICROFLUIDICS 7, 051502 (2013)). Microfluidic technology allows dynamic cell culture in microperfusion systems to deliver continuous nutrient supplies for long term cell culture. It offers many opportunities to mimic the cell-cell and cell-extracellular matrix interactions of tissues by creating gradient concentrations of biochemical signals such as growth factors, chemokines, and hormones. Other applications of cell cultivation in microfluidic systems include high resolution cell patterning on a modified substrate with adhesive patterns and the reconstruction of complicated tissue architectures. In the review of Tehranirokh et al., BIOMICROFLUIDICS 7, 051502 (2013), recent advances in microfluidic platforms for cell culturing and proliferation, for both simple monolayer (2D) cell seeding processes and 3D configurations as accurate models of in vivo conditions, are examined.

Drop-Sequence Methods (“Drop-Seq”)

Cells come in different types, sub-types and activity states, which are classified based on their shape, location, function, or molecular profiles, such as the set of RNAs that they express. RNA profiling is in principle particularly informative, as cells express thousands of different RNAs. Approaches that measure, for example, the level of every type of RNA have until recently been applied to “homogenized” samples, in which the contents of all the cells are mixed together. Methods to profile the RNA content of tens and hundreds of thousands of individual human cells, including from brain tissues, quickly and inexpensively have been recently developed. To do so, special microfluidic devices have been developed to encapsulate each cell in an individual drop, associate the RNA of each cell with a ‘cell barcode’ unique to that cell/drop, measure the expression level of each RNA with sequencing, and then use the cell barcodes to determine which cell each RNA molecule came from. See, e.g., U.S. 62/048,227 filed Sep. 9, 2014.

Methods of Macosko et al., 2015, Cell 161, 1202-1214 and Klein et al., 2015, Cell 161, 1187-1201 are contemplated for the present invention.

Microfluidics involves micro-scale devices that handle small volumes of fluids. Because microfluidics may accurately and reproducibly control and dispense small fluid volumes, in particular volumes less than 1 μL, application of microfluidics provides significant cost-savings. The use of microfluidics technology reduces cycle times, shortens time-to-results, and increases throughput. Furthermore, incorporation of microfluidics technology enhances system integration and automation. Microfluidic reactions are generally conducted in microdroplets or microwells. The ability to conduct reactions in microdroplets depends on being able to merge different sample fluids and different microdroplets. See, e.g., US Patent Publication No. 20120219947. See also international patent application serial no. PCT/US2014/058637 for disclosure regarding a microfluidic laboratory on a chip.

Droplet/microwell microfluidics offers significant advantages for performing high-throughput screens and sensitive assays. Droplets allow sample volumes to be significantly reduced, leading to concomitant reductions in cost. Manipulation and measurement at kilohertz speeds enable up to 10⁸ discrete biological entities (including, but not limited to, individual cells or organelles) to be screened in a single day. Compartmentalization in droplets increases assay sensitivity by increasing the effective concentration of rare species and decreasing the time required to reach detection thresholds. Droplet microfluidics combines these powerful features to enable currently inaccessible high-throughput screening applications, including single-cell and single-molecule assays. See, e.g., Guo et al., Lab Chip, 2012, 12, 2146-2155.
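The daily throughput figure above follows from simple arithmetic on the droplet processing rate. The short Python sketch below works through it; the function name and the 1 kHz example rate are illustrative assumptions, not values taken from Guo et al.

```python
def entities_per_day(rate_hz, hours=24.0):
    """Number of droplets (entities) processed at `rate_hz` over `hours` of operation."""
    return rate_hz * hours * 3600.0

# At a hypothetical 1 kHz, one day of continuous operation gives 8.64e7 entities,
# i.e. on the order of one hundred million.
print(entities_per_day(1_000))  # 86400000.0
```

Rates of a few kilohertz, or slightly more than one day at 1 kHz, would therefore be needed to reach a round hundred million entities.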

Drop-Sequence methods and apparatus provide high-throughput single-cell RNA-Seq and/or targeted nucleic acid profiling (for example, sequencing, quantitative reverse transcription polymerase chain reaction, and the like) where the RNAs from different cells are tagged individually, allowing a single library to be created while retaining the cell identity of each read. A combination of molecular barcoding and emulsion-based microfluidics is used to isolate, lyse, barcode, and prepare nucleic acids from individual cells in high-throughput. Microfluidic devices (for example, fabricated in polydimethylsiloxane) generate sub-nanoliter reverse emulsion droplets. These droplets are used to co-encapsulate nucleic acids with a barcoded capture bead. Each bead, for example, is uniquely barcoded so that each drop and its contents are distinguishable. The nucleic acids may come from any source known in the art, such as for example, those which come from a single cell, a pair of cells, a cellular lysate, or a solution. The cell is lysed as it is encapsulated in the droplet. To load single cells and barcoded beads into these droplets with Poisson statistics, 100,000 to 10 million such beads are needed to barcode 10,000-100,000 cells.
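The reliance on Poisson statistics for loading can be made concrete with a small calculation. The following Python sketch assumes, for illustration only, that cells and beads enter droplets independently, each following a Poisson distribution with a given mean number per droplet; the function names and the example mean of 0.1 are hypothetical and are not parameters stated in this disclosure.

```python
import math

def poisson_pmf(k, lam):
    """Probability of observing exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def droplet_fractions(cell_rate, bead_rate):
    """Loading statistics assuming independent Poisson arrival of cells and beads."""
    p_one_cell = poisson_pmf(1, cell_rate)
    p_one_bead = poisson_pmf(1, bead_rate)
    p_any_cell = 1.0 - poisson_pmf(0, cell_rate)
    return {
        # Fraction of all droplets holding exactly one cell and exactly one bead.
        "one_cell_and_one_bead": p_one_cell * p_one_bead,
        # Among droplets holding at least one cell, fraction with two or more cells.
        "multiplet_given_occupied": 1.0 - p_one_cell / p_any_cell,
    }

# Hypothetical dilute loading: on average 0.1 cells and 0.1 beads per droplet.
stats = droplet_fractions(0.1, 0.1)
print(stats)
```

Under these assumed rates fewer than 1% of droplets contain exactly one cell together with exactly one bead, which is consistent with the need for a large excess of beads relative to the number of cells ultimately barcoded.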

The invention provides a method for creating a single-cell sequencing library comprising: merging one uniquely barcoded mRNA capture microbead with a single cell in an emulsion droplet having a diameter of 75-125 μm; lysing the cell to make its RNA accessible for capturing by hybridization onto the mRNA capture microbead; performing a reverse transcription either inside or outside the emulsion droplet to convert the cell's mRNA to a first strand cDNA that is covalently linked to the mRNA capture microbead; pooling the cDNA-attached microbeads from all cells; and preparing and sequencing a single composite RNA-Seq library.

The invention provides a method for preparing uniquely barcoded mRNA capture microbeads, each of which has a unique barcode and a diameter suitable for microfluidic devices, comprising: 1) performing reverse phosphoramidite synthesis on the surface of the bead in a pool-and-split fashion, such that in each cycle of synthesis the beads are split into four reactions with one of the four canonical nucleotides (T, C, G, or A) or unique oligonucleotides of length two or more bases; 2) repeating this process a large number of times, at least two, and optimally more than twelve, such that, in the latter, there are more than 16 million unique barcodes on the surface of each bead in the pool. (See http://www.ncbi.nlm.nih.gov/pmc/articles/PMC206447).
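The figure of more than 16 million barcodes follows from four nucleotide choices over twelve cycles (4^12 = 16,777,216). The Python sketch below illustrates this count and simulates a pool-and-split process in which each bead is randomly assigned to one of four single-nucleotide reactions per cycle. The function names, the random assignment model, and the bead count are assumptions for illustration only and do not describe the chemistry itself.

```python
import random

BASES = "TCGA"  # the four canonical nucleotides named in the text

def barcode_space(cycles, choices=4):
    """Number of distinct barcodes possible after `cycles` split-pool rounds."""
    return choices ** cycles

def split_pool(n_beads, cycles, seed=0):
    """Simulate split-pool synthesis: one random base is added to every bead per cycle."""
    rng = random.Random(seed)
    barcodes = [""] * n_beads
    for _ in range(cycles):
        # Split: each bead lands in one of four reactions; pool: the list is kept together.
        barcodes = [barcode + rng.choice(BASES) for barcode in barcodes]
    return barcodes

print(barcode_space(12))  # 16777216
beads = split_pool(n_beads=1000, cycles=12)
print(beads[0], len(set(beads)))
```

Because beads are assigned at random, two beads can occasionally receive the same barcode; the chance of such collisions falls as the number of cycles grows relative to the number of beads used.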

Generally, the invention provides a method for preparing a large number of beads, particles, microbeads, nanoparticles, or the like with unique nucleic acid barcodes comprising performing polynucleotide synthesis on the surface of the beads in a pool-and-split fashion such that in each cycle of synthesis the beads are split into subsets that are subjected to different chemical reactions; and then repeating this split-pool process in two or more cycles, to produce a combinatorially large number of distinct nucleic acid barcodes. The invention further provides performing a polynucleotide synthesis wherein the synthesis may be any type of synthesis known to one of skill in the art for “building” polynucleotide sequences in a step-wise fashion. Examples include, but are not limited to, reverse direction synthesis with phosphoramidite chemistry or forward direction synthesis with phosphoramidite chemistry. Previous and well-known methods synthesize the oligonucleotides separately then “glue” the entire desired sequence onto the bead enzymatically. Applicants present a complexed bead and a novel process for producing these beads where nucleotides are chemically built onto the bead material in a high-throughput manner. Moreover, Applicants generally describe delivering a “packet” of beads which allows one to deliver millions of sequences into separate compartments and then screen all at once.

The invention further provides an apparatus for creating a single-cell sequencing library via a microfluidic system, comprising: an oil-surfactant inlet comprising a filter and a carrier fluid channel, wherein said carrier fluid channel further comprises a resistor; an inlet for an analyte comprising a filter and a carrier fluid channel, wherein said carrier fluid channel further comprises a resistor; an inlet for mRNA capture microbeads and lysis reagent comprising a filter and a carrier fluid channel, wherein said carrier fluid channel further comprises a resistor; said carrier fluid channels have a carrier fluid flowing therein at an adjustable or predetermined flow rate; wherein each said carrier fluid channels merge at a junction; and said junction being connected to a mixer, which contains an outlet for drops.

A mixture comprising a plurality of microbeads adorned with combinations of the following elements: bead-specific oligonucleotide barcodes created by the discussed methods; additional oligonucleotide barcode sequences which vary among the oligonucleotides on an individual bead and can therefore be used to differentiate or help identify those individual oligonucleotide molecules; additional oligonucleotide sequences that create substrates for downstream molecular-biological reactions, such as oligo-dT (for reverse transcription of mature mRNAs), specific sequences (for capturing specific portions of the transcriptome, or priming for DNA polymerases and similar enzymes), or random sequences (for priming throughout the transcriptome or genome). In an embodiment, the individual oligonucleotide molecules on the surface of any individual microbead contain all three of these elements, and the third element includes both oligo-dT and a primer sequence.
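One way the two barcode elements above could be used together downstream is sketched below in Python: the bead-specific barcode assigns each read to a cell, and the per-molecule barcode lets repeated reads of the same captured molecule be counted once. The read tuple layout, function name, and example sequences are hypothetical and serve only to illustrate the idea; exact-match collapsing is an assumed simplification that ignores sequencing errors.

```python
from collections import defaultdict

def count_molecules(reads):
    """Count distinct molecules per cell and gene.

    `reads` is an iterable of (cell_barcode, molecule_barcode, gene) tuples.
    Reads sharing all three fields are treated as copies of one molecule.
    """
    unique_molecules = set(reads)  # collapses amplification duplicates
    counts = defaultdict(lambda: defaultdict(int))
    for cell_barcode, _molecule_barcode, gene in unique_molecules:
        counts[cell_barcode][gene] += 1
    return {cell: dict(genes) for cell, genes in counts.items()}

# Hypothetical reads; the second entry duplicates the first.
reads = [
    ("AAAC", "GGT", "ACTB"),
    ("AAAC", "GGT", "ACTB"),
    ("AAAC", "TCA", "ACTB"),
    ("CCGT", "GGT", "GAPDH"),
]
print(count_molecules(reads))
```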

Examples of the labeling substance which may be employed include labeling substances known to those skilled in the art, such as fluorescent dyes, enzymes, coenzymes, chemiluminescent substances, and radioactive substances. Specific examples include radioisotopes (e.g., ³²P, ¹⁴C, ¹²⁵I, ³H, and ¹³¹I), fluorescein, rhodamine, dansyl chloride, umbelliferone, luciferase, peroxidase, alkaline phosphatase, β-galactosidase, β-glucosidase, horseradish peroxidase, glucoamylase, lysozyme, saccharide oxidase, microperoxidase, biotin, and ruthenium. In the case where biotin is employed as a labeling substance, preferably, after addition of a biotin-labeled antibody, streptavidin bound to an enzyme (e.g., peroxidase) is further added.

Advantageously, the label is a fluorescent label. Examples of fluorescent labels include, but are not limited to, Atto dyes; 4-acetamido-4′-isothiocyanatostilbene-2,2′-disulfonic acid; acridine and derivatives: acridine, acridine isothiocyanate; 5-(2′-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS); 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5 disulfonate; N-(4-anilino-1-naphthyl)maleimide; anthranilamide; BODIPY; Brilliant Yellow; coumarin and derivatives: coumarin, 7-amino-4-methylcoumarin (AMC, Coumarin 120), 7-amino-4-trifluoromethylcoumarin (Coumarin 151); cyanine dyes; cyanosine; 4′,6-diamidino-2-phenylindole (DAPI); 5′,5″-dibromopyrogallol-sulfonaphthalein (Bromopyrogallol Red); 7-diethylamino-3-(4′-isothiocyanatophenyl)-4-methylcoumarin; diethylenetriamine pentaacetate; 4,4′-diisothiocyanatodihydro-stilbene-2,2′-disulfonic acid; 4,4′-diisothiocyanatostilbene-2,2′-disulfonic acid; 5-[dimethylamino]naphthalene-1-sulfonyl chloride (DNS, dansyl chloride); 4-dimethylaminophenylazophenyl-4′-isothiocyanate (DABITC); eosin and derivatives: eosin, eosin isothiocyanate; erythrosin and derivatives: erythrosin B, erythrosin isothiocyanate; ethidium; fluorescein and derivatives: 5-carboxyfluorescein (FAM), 5-(4,6-dichlorotriazin-2-yl)aminofluorescein (DTAF), 2′,7′-dimethoxy-4′5′-dichloro-6-carboxyfluorescein, fluorescein, fluorescein isothiocyanate, QFITC (XRITC); fluorescamine; IR144; IR1446; Malachite Green isothiocyanate; 4-methylumbelliferone; ortho-cresolphthalein; nitrotyrosine; pararosaniline; Phenol Red; B-phycoerythrin; o-phthaldialdehyde; pyrene and derivatives: pyrene, pyrene butyrate, succinimidyl 1-pyrene butyrate; quantum dots; Reactive Red 4 (Cibacron™ Brilliant Red 3B-A); rhodamine and derivatives: 6-carboxy-X-rhodamine (ROX), 6-carboxyrhodamine (R6G), lissamine rhodamine B sulfonyl chloride, rhodamine (Rhod), rhodamine B, rhodamine 123, rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101, sulfonyl chloride derivative of sulforhodamine 101 (Texas Red); N,N,N′,N′-tetramethyl-6-carboxyrhodamine (TAMRA); tetramethyl rhodamine; tetramethyl rhodamine isothiocyanate (TRITC); riboflavin; rosolic acid; terbium chelate derivatives; Cy3; Cy5; Cy5.5; Cy7; IRD 700; IRD 800; La Jolla Blue; phthalo cyanine; and naphthalo cyanine.

The fluorescent label may be a fluorescent protein, such as blue fluorescent protein, cyan fluorescent protein, green fluorescent protein, red fluorescent protein, yellow fluorescent protein or any photoconvertible protein. Labeling may further be accomplished by colorimetric labeling, bioluminescent labeling and/or chemiluminescent labeling. Labeling further may include energy transfer between molecules in the hybridization complex by perturbation analysis, quenching, or electron transport between donor and acceptor molecules, the latter of which may be facilitated by double stranded match hybridization complexes. The fluorescent label may be a perylene or a terylene. In the alternative, the fluorescent label may be a fluorescent bar code.

In an advantageous embodiment, the label may be light sensitive, wherein the label is light-activated and/or light cleaves the one or more linkers to release the molecular cargo. The light-activated molecular cargo may be a major light-harvesting complex (LHCII). In another embodiment, the fluorescent label may induce free radical formation.

The invention discussed herein enables high throughput and high resolution delivery of reagents to individual emulsion droplets that may contain cells, organelles, nucleic acids, proteins, etc. through the use of monodisperse aqueous droplets that are generated by a microfluidic device as a water-in-oil emulsion. The droplets are carried in a flowing oil phase and stabilized by a surfactant. In one aspect, single cells, single organelles, or single molecules (proteins, RNA, DNA) are encapsulated into uniform droplets from an aqueous solution/dispersion. In a related aspect, multiple cells or multiple molecules may take the place of single cells or single molecules. The aqueous droplets, with volumes ranging from 1 pL to 10 nL, work as individual reactors. Disclosed embodiments provide 10⁴ to 10⁵ single cells in droplets which can be processed and analyzed in a single run.

To utilize microdroplets for rapid large-scale chemical screening or complex biological library identification, different species of microdroplets, each containing the specific chemical compounds, biological probes, cells, or molecular barcodes of interest, have to be generated and combined at the preferred conditions, e.g., mixing ratio, concentration, and order of combination.

Each species of droplet is introduced at a confluence point in a main microfluidic channel from separate inlet microfluidic channels. Preferably, droplet volumes are chosen by design such that one species is larger than others and moves at a different speed, usually slower than the other species, in the carrier fluid, as disclosed in U.S. Publication No. US 2007/0195127 and International Publication No. WO 2007/089541, each of which is incorporated herein by reference in its entirety. The channel width and length are selected such that faster species of droplets catch up to the slowest species. Size constraints of the channel prevent the faster moving droplets from passing the slower moving droplets, resulting in a train of droplets entering a merge zone. Multi-step chemical reactions, biochemical reactions, or assay detection chemistries often require a fixed reaction time before species of different type are added to a reaction. Multi-step reactions are achieved by repeating the process multiple times with a second, third or more confluence points, each with a separate merge point. Highly efficient and precise reactions and analysis of reactions are achieved when the frequencies of droplets from the inlet channels are matched to an optimized ratio and the volumes of the species are matched to provide optimized reaction conditions in the combined droplets.

Fluidic droplets may be screened or sorted within a fluidic system of the invention by altering the flow of the liquid containing the droplets. For instance, in one set of embodiments, a fluidic droplet may be steered or sorted by directing the liquid surrounding the fluidic droplet into a first channel, a second channel, etc. In another set of embodiments, pressure within a fluidic system, for example, within different channels or within different portions of a channel, can be controlled to direct the flow of fluidic droplets. For example, a droplet can be directed toward a channel junction including multiple options for further direction of flow (e.g., directed toward a branch, or fork, in a channel defining optional downstream flow channels). Pressure within one or more of the optional downstream flow channels can be controlled to direct the droplet selectively into one of the channels, and changes in pressure can be effected on the order of the time required for successive droplets to reach the junction, such that the downstream flow path of each successive droplet can be independently controlled. In one arrangement, the expansion and/or contraction of liquid reservoirs may be used to steer or sort a fluidic droplet into a channel, e.g., by causing directed movement of the liquid containing the fluidic droplet. In another embodiment, the expansion and/or contraction of the liquid reservoir may be combined with other flow-controlling devices and methods, e.g., as discussed herein. Non-limiting examples of devices able to cause the expansion and/or contraction of a liquid reservoir include pistons.

Key elements for using microfluidic channels to process droplets include: (1) producing droplets of the correct volume, (2) producing droplets at the correct frequency and (3) bringing together a first stream of sample droplets with a second stream of sample droplets in such a way that the frequency of the first stream of sample droplets matches the frequency of the second stream of sample droplets. Preferably, a stream of sample droplets is brought together with a stream of premade library droplets in such a way that the frequency of the library droplets matches the frequency of the sample droplets.

Methods for producing droplets of a uniform volume at a regular frequency are well known in the art. One method is to generate droplets using hydrodynamic focusing of a dispersed phase fluid and immiscible carrier fluid, such as disclosed in U.S. Publication No. US 2005/0172476 and International Publication No. WO 2004/002627. It is desirable for one of the species introduced at the confluence to be a pre-made library of droplets where the library contains a plurality of reaction conditions. For example, a library may contain a plurality of different compounds at a range of concentrations encapsulated as separate library elements for screening their effect on cells or enzymes; alternatively, a library could be composed of a plurality of different primer pairs encapsulated as different library elements for targeted amplification of a collection of loci; alternatively, a library could contain a plurality of different antibody species encapsulated as different library elements to perform a plurality of binding assays. The introduction of a library of reaction conditions onto a substrate is achieved by pushing a premade collection of library droplets out of a vial with a drive fluid. The drive fluid is a continuous fluid. The drive fluid may comprise the same substance as the carrier fluid (e.g., a fluorocarbon oil). For example, if a library consisting of 10 pico-liter droplets is driven into an inlet channel on a microfluidic substrate with a drive fluid at a rate of 10,000 pico-liters per second, then nominally the frequency at which the droplets are expected to enter the confluence point is 1000 per second. However, in practice droplets pack with oil between them that slowly drains. Over time the carrier fluid drains from the library droplets and the number density of the droplets (number/mL) increases. Hence, a simple fixed rate of infusion for the drive fluid does not provide a uniform rate of introduction of the droplets into the microfluidic channel in the substrate. Moreover, library-to-library variations in the mean library droplet volume result in a shift in the frequency of droplet introduction at the confluence point. Thus, the lack of uniformity of droplets that results from sample variation and oil drainage provides another problem to be solved. For example, if the nominal droplet volume is expected to be 10 pico-liters in the library, but varies from 9 to 11 pico-liters from library-to-library, then a 10,000 pico-liter/second infusion rate will nominally produce a range in frequencies from approximately 900 to 1,100 droplets per second. In short, sample-to-sample variation in the composition of dispersed phase for droplets made on chip, a tendency for the number density of library droplets to increase over time, and library-to-library variations in mean droplet volume severely limit the extent to which frequencies of droplets may be reliably matched at a confluence by simply using fixed infusion rates. In addition, these limitations also have an impact on the extent to which volumes may be reproducibly combined. Combined with typical variations in pump flow rate precision and variations in channel dimensions, systems are severely limited without a means to compensate on a run-to-run basis. The foregoing facts not only illustrate a problem to be solved, but also demonstrate a need for a method of instantaneous regulation of microfluidic control over microdroplets within a microfluidic channel.
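The nominal frequency arithmetic in the foregoing example can be sketched as follows. This is a minimal illustration, assuming that the infused volume consists only of droplets with no interstitial oil; the function names are hypothetical and are not part of any disclosed device. The exact quotients for 11 and 9 pico-liter droplets are about 909 and 1,111 droplets per second, which the text quotes in rounded form.

```python
def droplet_frequency_hz(infusion_rate_pl_per_s, droplet_volume_pl):
    """Nominal droplet arrival frequency (droplets per second).

    Assumes the infused volume consists only of droplets, i.e. no
    interstitial oil between them (an idealization).
    """
    if droplet_volume_pl <= 0:
        raise ValueError("droplet volume must be positive")
    return infusion_rate_pl_per_s / droplet_volume_pl


def frequency_range_hz(infusion_rate_pl_per_s, min_volume_pl, max_volume_pl):
    """Lowest and highest nominal frequency for a range of mean droplet volumes."""
    low = droplet_frequency_hz(infusion_rate_pl_per_s, max_volume_pl)
    high = droplet_frequency_hz(infusion_rate_pl_per_s, min_volume_pl)
    return low, high


nominal = droplet_frequency_hz(10_000, 10)     # 10 pL droplets at 10,000 pL/s
low, high = frequency_range_hz(10_000, 9, 11)  # library-to-library variation
print(nominal, round(low), round(high))        # prints 1000.0 909 1111
```

Because the frequency is inversely proportional to the mean droplet volume, a fixed infusion rate passes any volume variation directly into the arrival frequency, which is the matching problem described above.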

Combinations of surfactant(s) and oils must be developed to facilitate generation, storage, and manipulation of droplets to maintain the unique chemical/biochemical/biological environment within each droplet of a diverse library. Therefore, the surfactant and oil combination must (1) stabilize droplets against uncontrolled coalescence during the drop forming process and subsequent collection and storage, (2) minimize transport of any droplet contents to the oil phase and/or between droplets, and (3) maintain chemical and biological inertness with contents of each droplet (e.g., no adsorption or reaction of encapsulated contents at the oil-water interface, and no adverse effects on biological or chemical constituents in the droplets). In addition to the requirements on the droplet library function and stability, the surfactant-in-oil solution must be coupled with the fluid physics and materials associated with the platform. Specifically, the oil solution must not swell, dissolve, or degrade the materials used to construct the microfluidic chip, and the physical properties of the oil (e.g., viscosity, boiling point, etc.) must be suited for the flow and operating conditions of the platform.

Droplets formed in oil without surfactant are not stable against coalescence, so surfactants must be dissolved in the oil that is used as the continuous phase for the emulsion library. Surfactant molecules are amphiphilic: part of the molecule is oil soluble, and part of the molecule is water soluble. When a water-oil interface is formed at the nozzle of a microfluidic chip, for example in the inlet module discussed herein, surfactant molecules that are dissolved in the oil phase adsorb to the interface. The hydrophilic portion of the molecule resides inside the droplet and the fluorophilic portion of the molecule decorates the exterior of the droplet. The surface tension of a droplet is reduced when the interface is populated with surfactant, so the stability of an emulsion is improved. In addition to stabilizing the droplets against coalescence, the surfactant should be inert to the contents of each droplet and the surfactant should not promote transport of encapsulated components to the oil or other droplets.

A droplet library may be made up of a number of library elements that are pooled together in a single collection (see, e.g., US Patent Publication No. 2010002241). Libraries may vary in complexity from a single library element to 10¹⁵ library elements or more. Each library element may be one or more given components at a fixed concentration. The element may be, but is not limited to, cells, organelles, virus, bacteria, yeast, beads, amino acids, proteins, polypeptides, nucleic acids, polynucleotides or small molecule chemical compounds. The element may contain an identifier such as a label. The terms “droplet library” or “droplet libraries” are also referred to herein as an “emulsion library” or “emulsion libraries.” These terms are used interchangeably throughout the specification.

A cell library element may include, but is not limited to, hybridomas, B-cells, primary cells, cultured cell lines, cancer cells, stem cells, cells obtained from tissue, or any other cell type. Cellular library elements are prepared by encapsulating a number of cells from one to hundreds of thousands in individual droplets. The number of cells encapsulated is usually given by Poisson statistics from the number density of cells and volume of the droplet. However, in some cases the number deviates from Poisson statistics as discussed in Edd et al., “Controlled encapsulation of single-cells into monodisperse picolitre drops.” Lab Chip, 8(8): 1262-1264, 2008. The discrete nature of cells allows for libraries to be prepared en masse with a plurality of cellular variants all present in a single starting media, and then that media is broken up into individual droplet capsules that contain at most one cell. These individual droplet capsules are then combined or pooled to form a library consisting of unique library elements. Cell division subsequent to, or in some embodiments following, encapsulation produces a clonal library element.

A bead based library element may contain one or more beads of a given type and may also contain other reagents, such as antibodies, enzymes or other proteins. In the case where all library elements contain different types of beads, but the same surrounding media, the library elements may all be prepared from a single starting fluid or have a variety of starting fluids. In the case of cellular libraries prepared en masse from a collection of variants, such as genomically modified yeast or bacteria cells, the library elements will be prepared from a variety of starting fluids.

Often it is desirable to have exactly one cell per droplet with only a few droplets containing more than one cell when starting with a plurality of cells or yeast or bacteria, engineered to produce variants on a protein. In some cases, variations from Poisson statistics may be achieved to provide an enhanced loading of droplets such that there are more droplets with exactly one cell per droplet and few exceptions of empty droplets or droplets containing more than one cell.
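The Poisson loading referred to above can be illustrated with a short calculation. The following is a minimal sketch, assuming ideal Poisson encapsulation in which the mean number of cells per droplet equals the cell number density multiplied by the droplet volume; the function names and the example density are hypothetical.

```python
import math


def mean_cells_per_droplet(cells_per_ml, droplet_volume_pl):
    """Expected cells per droplet: number density times droplet volume."""
    return cells_per_ml * droplet_volume_pl * 1e-9  # 1 pL = 1e-9 mL


def poisson_pmf(k, lam):
    """Probability of exactly k cells in a droplet with mean occupancy lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def loading_fractions(lam):
    """Fractions of empty, singly loaded, and multiply loaded droplets."""
    empty = poisson_pmf(0, lam)
    single = poisson_pmf(1, lam)
    return {"empty": empty, "single": single, "multiple": 1.0 - empty - single}


# Hypothetical example: 1e6 cells/mL encapsulated in 100 pL droplets.
lam = mean_cells_per_droplet(1e6, 100)
for name, fraction in loading_fractions(lam).items():
    print(f"{name}: {fraction:.4f}")
```

Under this idealization, diluting the cell suspension reduces the fraction of multiply loaded droplets at the cost of more empty droplets, which is why departures from Poisson statistics are of interest for enhanced single-cell loading.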

Examples of droplet libraries are collections of droplets that have different contents, such as beads, cells, small molecules, DNA, primers, and antibodies. Smaller droplets may be in the order of femtoliter (fL) volume drops, which are especially contemplated with the droplet dispenser. The volume may range from about 5 to about 600 fL. The larger droplets range in size from roughly 0.5 micron to 500 micron in diameter, which corresponds to about 1 pico liter to 1 nano liter. However, droplets may be as small as 5 microns and as large as 500 microns. Preferably, the droplets are less than 100 microns, about 1 micron to about 100 microns, in diameter. The most preferred size is about 20 to 40 microns in diameter (10 to 100 picoliters). The preferred properties examined of droplet libraries include osmotic pressure balance, uniform size, and size ranges.
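The relationship between droplet diameter and droplet volume can be sketched as follows. This is a minimal illustration, assuming ideal spherical droplets; the function names are hypothetical. Because volume scales with the cube of the diameter, the geometric values for an ideal sphere may differ from the rounded figures quoted in the text.

```python
import math


def sphere_volume_pl(diameter_um):
    """Volume in picoliters of an ideal sphere of the given diameter in microns."""
    volume_um3 = math.pi * diameter_um ** 3 / 6.0
    return volume_um3 * 1e-3  # 1 cubic micron = 1 fL = 1e-3 pL


def sphere_diameter_um(volume_pl):
    """Diameter in microns of an ideal sphere of the given volume in picoliters."""
    volume_um3 = volume_pl * 1e3
    return (6.0 * volume_um3 / math.pi) ** (1.0 / 3.0)


for d in (5, 20, 40, 100):
    print(f"{d} micron diameter -> {sphere_volume_pl(d):.3g} pL")
```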

The droplets comprised within the emulsion libraries of the present invention may be contained within an immiscible oil which may comprise at least one fluorosurfactant. In some embodiments, the fluorosurfactant comprised within immiscible fluorocarbon oil is a block copolymer consisting of one or more perfluorinated polyether (PFPE) blocks and one or more polyethylene glycol (PEG) blocks. In other embodiments, the fluorosurfactant is a triblock copolymer consisting of a PEG center block covalently bound to two PFPE blocks by amide linking groups. The presence of the fluorosurfactant (similar to uniform size of the droplets in the library) is critical to maintain the stability and integrity of the droplets and is also essential for the subsequent use of the droplets within the library for the various biological and chemical assays discussed herein. Fluids (e.g., aqueous fluids, immiscible oils, etc.) and other surfactants that may be utilized in the droplet libraries of the present invention are discussed in greater detail herein.

The present invention provides an emulsion library which may comprise a plurality of aqueous droplets within an immiscible oil (e.g., fluorocarbon oil) which may comprise at least one fluorosurfactant, wherein each droplet is uniform in size and may comprise the same aqueous fluid and may comprise a different library element. The present invention also provides a method for forming the emulsion library which may comprise providing a single aqueous fluid which may comprise different library elements, encapsulating each library element into an aqueous droplet within an immiscible fluorocarbon oil which may comprise at least one fluorosurfactant, wherein each droplet is uniform in size and may comprise the same aqueous fluid and may comprise a different library element, and pooling the aqueous droplets within an immiscible fluorocarbon oil which may comprise at least one fluorosurfactant, thereby forming an emulsion library.

For example, in one type of emulsion library, all different types of elements (e.g., cells or beads) may be pooled in a single source contained in the same medium. After the initial pooling, the cells or beads are then encapsulated in droplets to generate a library of droplets wherein each droplet with a different type of bead or cell is a different library element. The dilution of the initial solution enables the encapsulation process. In some embodiments, the droplets formed will either contain a single cell or bead or will not contain anything, i.e., be empty. In other embodiments, the droplets formed will contain multiple copies of a library element. The cells or beads being encapsulated are generally variants on the same type of cell or bead. In one example, the cells may comprise cancer cells of a tissue biopsy, and each cell type is encapsulated to be screened for genomic data or against different drug therapies. Another example is that 10¹¹ or 10¹⁵ different types of bacteria, each having a different plasmid spliced therein, are encapsulated. One example is a bacterial library where each library element grows into a clonal population that secretes a variant on an enzyme.

In another example, the emulsion library may comprise a plurality of aqueous droplets within an immiscible fluorocarbon oil, wherein a single molecule may be encapsulated, such that there is a single molecule contained within a droplet for every 20-60 droplets produced (e.g., 20, 25, 30, 35, 40, 45, 50, 55, 60 droplets, or any integer in between). Single molecules may be encapsulated by diluting the solution containing the molecules to such a low concentration that the encapsulation of single molecules is enabled. In one specific example, a LacZ plasmid DNA was encapsulated at a concentration of 20 fM after two hours of incubation such that there was about one gene in 40 droplets, where 10 μm droplets were made at 10 kHz. Formation of these libraries relies on limiting dilutions.

Methods of the invention involve forming sample droplets. The droplets are aqueous droplets that are surrounded by an immiscible carrier fluid. Methods of forming such droplets are shown for example in Link et al. (U.S. patent application numbers 2008/0014589, 2008/0003142, and 2010/0137163), Stone et al. (U.S. Pat. No. 7,708,949 and U.S. patent application number 2010/0172803), Anderson et al. (U.S. Pat. No. 7,041,481 and which reissued as RE41,780) and European publication number EP2047910 to Raindance Technologies Inc., the content of each of which is incorporated by reference herein in its entirety.

In certain embodiments, the carrier fluid may contain one or more additives, such as agents which reduce surface tensions (surfactants). Surfactants can include Tween, Span, fluorosurfactants, and other agents that are soluble in oil relative to water. In some applications, performance is improved by adding a second surfactant to the sample fluid. Surfactants can aid in controlling or optimizing droplet size, flow and uniformity, for example by reducing the shear force needed to extrude or inject droplets into an intersecting channel. This can affect droplet volume and periodicity, or the rate or frequency at which droplets break off into an intersecting channel. Furthermore, the surfactant can serve to stabilize aqueous emulsions in fluorinated oils from coalescing.

In certain embodiments, the droplets may be surrounded by a surfactant which stabilizes the droplets by reducing the surface tension at the aqueous oil interface. Preferred surfactants that may be added to the carrier fluid include, but are not limited to, surfactants such as sorbitan-based carboxylic acid esters (e.g., the “Span” surfactants, Fluka Chemika), including sorbitan monolaurate (Span 20), sorbitan monopalmitate (Span 40), sorbitan monostearate (Span 60) and sorbitan monooleate (Span 80), and perfluorinated polyethers (e.g., DuPont Krytox 157 FSL, FSM, and/or FSH). Other non-limiting examples of non-ionic surfactants which may be used include polyoxyethylenated alkylphenols (for example, nonyl-, p-dodecyl-, and dinonylphenols), polyoxyethylenated straight chain alcohols, polyoxyethylenated polyoxypropylene glycols, polyoxyethylenated mercaptans, long chain carboxylic acid esters (for example, glyceryl and polyglyceryl esters of natural fatty acids, propylene glycol, sorbitol, polyoxyethylenated sorbitol esters, polyoxyethylene glycol esters, etc.) and alkanolamines (e.g., diethanolamine-fatty acid condensates and isopropanolamine-fatty acid condensates).

By incorporating a plurality of unique tags into the additional droplets and joining the tags to a solid support designed to be specific to the primary droplet, the conditions that the primary droplet is exposed to may be encoded and recorded. For example, nucleic acid tags can be sequentially ligated to create a sequence reflecting conditions and order of same. Alternatively, the tags can be added independently, appended to a solid support. Non-limiting examples of a dynamic labeling system that may be used to bioinformatically record information can be found at US Provisional Patent Application entitled “Compositions and Methods for Unique Labeling of Agents” filed Sep. 21, 2012 and Nov. 29, 2012. In this way, two or more droplets may be exposed to a variety of different conditions, where each time a droplet is exposed to a condition, a nucleic acid encoding the condition is added to the droplet, each ligated together or to a unique solid support associated with the droplet, such that, even if the droplets with different histories are later combined, the conditions of each of the droplets remain available through the different nucleic acids. Non-limiting examples of methods to evaluate response to exposure to a plurality of conditions can be found at US Provisional Patent Application entitled “Systems and Methods for Droplet Tagging” filed Sep. 21, 2012.
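The idea of sequentially ligated tags recording the conditions and their order can be modeled in a few lines. This is a minimal sketch for illustration only: the tag sequences, the fixed tag length, the condition names, and the function names are all hypothetical and do not represent the tags of any cited application.

```python
# Hypothetical fixed-length tags, one per condition (illustrative only).
TAG_LENGTH = 4
CONDITION_TAGS = {"drugA": "ACGT", "drugB": "TGCA", "heat": "GGCC", "wash": "AATT"}
TAG_CONDITIONS = {tag: name for name, tag in CONDITION_TAGS.items()}


def ligate(history, condition):
    """Append the tag for a condition to a droplet's growing tag sequence."""
    return history + CONDITION_TAGS[condition]


def decode(history):
    """Recover the ordered list of conditions from a ligated tag sequence."""
    if len(history) % TAG_LENGTH != 0:
        raise ValueError("sequence length is not a multiple of the tag length")
    return [TAG_CONDITIONS[history[i:i + TAG_LENGTH]]
            for i in range(0, len(history), TAG_LENGTH)]


# Two droplets exposed to different conditions in different orders.
droplet_1 = ligate(ligate("", "drugA"), "heat")
droplet_2 = ligate(ligate(ligate("", "heat"), "wash"), "drugB")

# Even after pooling, each history can be read back from its own sequence.
for sequence in (droplet_1, droplet_2):
    print(sequence, decode(sequence))
```

In this toy model the order of ligation is preserved in the concatenated sequence, so droplets with different histories stay distinguishable after they are combined.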

Applications of the disclosed device may include use for the dynamic generation of molecular barcodes (e.g., DNA oligonucleotides, fluorophores, etc.) either independent from or in concert with the controlled delivery of various compounds of interest (drugs, small molecules, siRNA, CRISPR guide RNAs, reagents, etc.). For example, unique molecular barcodes can be created in one array of nozzles while individual compounds or combinations of compounds can be generated by another nozzle array. Barcodes/compounds of interest can then be merged with cell-containing droplets. An electronic record in the form of a computer log file is kept to associate the barcode delivered with the downstream reagent(s) delivered. This methodology makes it possible to efficiently screen a large population of cells for applications such as single-cell drug screening, controlled perturbation of regulatory pathways, etc. The device and techniques of the disclosed invention facilitate efforts to perform studies that require data resolution at the single cell (or single molecule) level and in a cost effective manner. Disclosed embodiments provide a high throughput and high resolution delivery of reagents to individual emulsion droplets that may contain cells, nucleic acids, proteins, etc. through the use of monodisperse aqueous droplets that are generated one by one in a microfluidic chip as a water-in-oil emulsion. Hence, the invention proves advantageous over prior art systems by being able to dynamically track individual cells and droplet treatments/combinations during life cycle experiments. An additional advantage of the disclosed invention is the ability to create a library of emulsion droplets on demand with the further capability of manipulating the droplets through the disclosed process(es). Disclosed embodiments may, thereby, provide dynamic tracking of the droplets and create a history of droplet deployment and application in a single cell based environment.
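The electronic record associating each delivered barcode with the downstream reagent(s) could take many forms. The following is a minimal sketch, assuming a simple comma-separated log with one row per barcode; the column names, the semicolon separator for reagent lists, the barcode strings, and the function names are hypothetical choices made for illustration.

```python
import csv
import io


def write_delivery_log(records, stream):
    """Write (barcode, [reagents]) pairs as rows of a CSV log."""
    writer = csv.writer(stream)
    writer.writerow(["barcode", "reagents"])
    for barcode, reagents in records:
        writer.writerow([barcode, ";".join(reagents)])


def load_delivery_log(stream):
    """Read the log back into a barcode -> list-of-reagents lookup."""
    lookup = {}
    for row in csv.DictReader(stream):
        reagents = row["reagents"]
        lookup[row["barcode"]] = reagents.split(";") if reagents else []
    return lookup


# Hypothetical deliveries; an in-memory buffer stands in for the log file.
log = io.StringIO()
write_delivery_log([("ACGTAC", ["drugA"]),
                    ("TTGCAA", ["drugA", "siRNA_1"]),
                    ("GGATCC", [])], log)
log.seek(0)
lookup = load_delivery_log(log)
print(lookup["TTGCAA"])  # reagents recorded for the droplet carrying TTGCAA
```

After sequencing, a barcode recovered from a cell can be looked up in such a record to retrieve the treatment that the corresponding droplet received.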

Droplets are generated and deployed via a dynamic indexing strategy and in a controlled fashion in accordance with disclosed embodiments of the present invention. Disclosed embodiments of the microfluidic device discussed herein provide microdroplets that can be processed, analyzed and sorted at a highly efficient rate of several thousand droplets per second, providing a powerful platform which allows rapid screening of millions of distinct compounds, biological probes, proteins or cells either in cellular models of biological mechanisms of disease, or in biochemical, or pharmacological assays.

Well-Based Biological Analysis (“Seq-Well”)

The well-based biological analysis platform, also referred to as Seq-Well, facilitates the creation of barcoded single-cell sequencing libraries from thousands of single cells using a device that contains 100,000 40-micron wells. Importantly, single beads can be loaded into each microwell with a low frequency of duplicates due to size exclusion (average bead diameter 35 μm). By using a microwell array, loading efficiency is greatly increased compared to Drop-seq, which requires Poisson loading of beads to avoid duplication at the expense of increased cell input requirements. Seq-Well, however, is capable of capturing nearly 100% of cells applied to the surface of the device.

Seq-Well is a methodology which allows attachment of a porous membrane to a container in conditions which are benign to living cells. Combined with arrays of picoliter-scale volume containers made, for example, in PDMS, the platform provides the creation of hundreds of thousands of isolated dialysis chambers which can be used for many different applications. The platform also provides single cell lysis procedures for single cell RNA-seq, whole genome amplification or proteome capture; highly multiplexed single cell nucleic acid preparation (~100× increase over current approaches); highly parallel growth of clonal bacterial populations, thus providing synthetic biology applications as well as basic recombinant protein expression; selection of bacteria that have increased secretion of a recombinant product (a possible product could also be a small molecule metabolite, which could have considerable utility in the chemical industry and biofuels); retention of cells during multiple microengraving events; long term capture of secreted products from single cells; and screening of cellular events. Principles of the present methodology allow for addition and subtraction of materials from the containers, which has not previously been available on the present scale in other modalities.

Seq-Well also enables stable attachment (through multiple established chemistries) of porous membranes to PDMS nanowell devices in conditions that do not affect cells. Based on requirements for downstream assays, the PDMS device is functionalized with amines and the membrane is oxidized with plasma. With regard to general cell culture uses, the PDMS is amine functionalized by air plasma treatment followed by submersion in an aqueous solution of poly(lysine) followed by baking at 80° C. For processes that require robust denaturing conditions, the amine must be covalently linked to the surface. This is accomplished by treating the PDMS with air plasma, followed by submersion in an ethanol solution of amine-silane, followed by baking at 80° C., followed by submersion in 0.2% phenylene diisothiocyanate (PDITC) DMF/pyridine solution, followed by baking, followed by submersion in chitosan or poly(lysine) solution. For functionalization of the membrane for protein capture, the membrane can be amine-silanized using vapor deposition and then treated in solution with NHS-biotin or NHS-maleimide to turn the amine groups into the crosslinking species.

After functionalization, the device is loaded with cells (bacterial, mammalian or yeast) in compatible buffers. The cell laden device is then brought in contact with the functionalized membrane using a clamping device. A plain glass slide is placed on top of the membrane in the clamp to provide force for bringing the two surfaces together. After incubation, preferably for one hour, the clamp is opened and the glass slide is removed. The device can then be submerged in any aqueous buffer for days without the membrane detaching, enabling repetitive measurements of the cells without any cell loss. The covalently-linked membrane is stable in many harsh buffers including guanidine hydrochloride, which can be used to robustly lyse cells. If the pore size of the membrane is small, the products from the lysed cells will be retained in each well. The lysing buffer can be washed out and replaced with a different buffer which allows binding of biomolecules to probes preloaded in the wells. The membrane can then be removed, enabling addition of enzymes to reverse transcribe or amplify nucleic acids captured in the wells after lysis. Importantly, the chemistry enables removal of one membrane and replacement with a membrane with a different pore size to enable integration of multiple activities on the same array.

As discussed, while the platform has been optimized for the generation of individually barcoded single-cell sequencing libraries following confinement of cells and mRNA capture beads (Macosko, et al. 2015), it is capable of multiple levels of data acquisition. The platform is compatible with other assays and measurements performed with the same array. For example, profiling of human antibody responses by integrated single-cell analysis is discussed with regard to measuring levels of cell surface proteins (Ogunniyi, A. O., B. A. Thomas, T. J. Politano, N. Varadarajan, E. Landais, P. Poignard, B. D. Walker, D. S. Kwon, and J. C. Love, “Profiling Human Antibody Responses by Integrated Single-Cell Analysis” Vaccine, 32(24), 2866-2873). The authors demonstrate a complete characterization of the antigen-specific B cells induced during infections or following vaccination, which enables and informs one of skill in the art how interventions shape protective humoral responses. Specifically, this disclosure combines single-cell profiling with on-chip image cytometry, microengraving, and single-cell RT-PCR.

Undersampling—A Sampling Based Framework for Genetic Interactions

According to the invention, random sampling may comprise matrix completion, tensor completion, compressed sensing, or kernel learning.

In some aspects, where random sampling comprises matrix completion, tensor completion, or compressed sensing, π may be of the order of log P.

The invention relies on a random sampling assumption, e.g. that the combinatorial space is sparse and/or of low rank. This assumption is generic and advantageously does not rely on the pre-determination of a (known) set of genetic interactions. This assumption constrains the range or complexity of models, and thus can be used to restrict sampling size (undersampling). Further, as detailed below, the invention relies on the following: (1) Given a limited number of assays, if one wishes to infer interactions up to an order j, it is advantageous to randomly sample interactions at a higher order k>j, because higher order perturbations maximize the information that can be recovered; and (2) in such a method, one may use a model that accounts for higher order interactions when analyzing lower order ones. For example, it is possible to aim for each perturbation to target k˜5-7 genes at once to estimate/model interactions at lower order j˜3-5.

Although some experimental methods open the way to test non-linear interactions by high order combinatorial genetic perturbations, exhaustive combinatorial exploration is intractable for anything but 2- or 3-way interactions for a few genes.
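
The scale of this intractability can be illustrated with a short calculation. The following is a minimal sketch, assuming each gene is either intact or knocked out; the function names and the example gene counts are hypothetical and chosen only for illustration.

```python
from math import comb


def n_terms_up_to_order(m: int, j: int) -> int:
    """Number of distinct gene sets of size 1..j among m genes."""
    return sum(comb(m, k) for k in range(1, j + 1))


def n_genotypes(m: int) -> int:
    """Number of knockout states when each of m genes is intact or knocked out."""
    return 2 ** m


# Hypothetical gene counts, for illustration only.
for m in (5, 20, 100):
    print(m, n_terms_up_to_order(m, 3), n_genotypes(m))
```

Under these assumptions the number of interactions up to third order grows polynomially in the number of genes, whereas the number of possible genotypes doubles with each added gene.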

According to the invention, random matrix theory and compressive sensing may be used to re-formulate this as a random sampling problem, developing a new framework from experimental design to model inference, testing and refinement.

To infer combinatorial models from a dramatic under-sampling of the full high-order combinatorial space with massively combinatorial molecular perturbations (MCPP), one may rely on random matrix theory, compressive sensing and kernel learning.

According to the invention, it is made possible to model non-linear regulatory functions from genetic manipulations (perturbations).

One may learn models of higher-order genetic interactions from combinatorial perturbations with single cell profiling. Although the learning problem is underdetermined due to combinatorial explosion (2^(m) possible interaction terms among m genes), it can become tractable in the presence of additional structure, including sparsity and smoothness, that constrains the range or complexity of models. One may thus rely on the following: (1) Given a limited number of assays, if one wishes to infer interactions up to an order j, it is advantageous to randomly sample interactions at a higher order k>j, because higher order perturbations maximize the information that can be recovered; and (2) in such a design, one can use a model that accounts for higher order interactions when analyzing lower order ones. One may for example aim for each perturbation to target k˜5-7 genes at once to estimate interactions at j˜3-5.

Thus the present invention relies on a learning approach that takes multiplex perturbations at a high order n and complex readout data (e.g., RNA profile) and infers a model of genetic interactions at a lower order (m<n), as well as strategies for experimental design, model testing and refinement.

If one assumes that genetic interactions are low rank, sparse, or both, then the true number of degrees of freedom is small relative to the complete combinatorial expansion, so that one can infer the full nonlinear landscape with a relatively small random sampling of high-order perturbations, without specific knowledge of which genes are likely to interact. Analysis of prior studies supports the sparsity assumption in yeast (for fitness: Costanzo, M., Baryshnikova, A., Bellay, J., Kim, Y., Spear, E. D., Sevier, C. S., Ding, H., Koh, J. L., Toufighi, K., Mostafavi, S., Prinz, J., St Onge, R. P., VanderSluis, B., Makhnevych, T., Vizeacoumar, F. J., Alizadeh, S., Bahr, S., Brost, R. L., Chen, Y., Cokol, M., Deshpande, R., Li, Z., Lin, Z. Y., Liang, W., Marback, M., Paw, J., San Luis, B. J., Shuteriqi, E., Tong, A. H., van Dyk, N., Wallace, I. M., Whitney, J. A., Weirauch, M. T., Zhong, G., Zhu, H., Houry, W. A., Brudno, M., Ragibizadeh, S., Papp, B., Pal, C., Roth, F. P., Giaever, G., Nislow, C., Troyanskaya, O. G., Bussey, H., Bader, G. D., Gingras, A. C., Morris, Q. D., Kim, P. M., Kaiser, C. A., Myers, C. L., Andrews, B. J. & Boone, C. The genetic landscape of a cell. Science. 327, 425-431, doi:10.1126/science.1180823 (2010)), and fly (for 11 imaging phenotypes: Laufer, C., Fischer, B., Billmann, M., Huber, W. & Boutros, M. Mapping genetic interactions in human cancer cells with RNAi and multiparametric phenotyping. Nat Methods. 10, 427-431, doi:10.1038/nmeth.2436 (2013)), and to the limited tested extent, mammals (for 60 genes: Bassik, M. C., Kampmann, M., Lebbink, R. J., Wang, S., Hein, M. Y., Poser, I., Weibezahn, J., Horlbeck, M. A., Chen, S., Mann, M., Hyman, A. A., Leproust, E. M., McManus, M. T. & Weissman, J. S. A systematic mammalian genetic interaction map reveals pathways underlying ricin susceptibility. Cell. 152, 909-922, doi:10.1016/j.cell.2013.01.030 (2013). PMCID:3652613).

Matrix (Tensor) Completion.

All the values of a matrix (tensor) are filled in using a small collection of sampled entries. Applicants hypothesize that the rank of a tensor of higher-order interactions is a fraction of the number of tested genes; this hypothesis is tested by calculating the rank from a dense sampling of second or third order knockouts from a small collection of genes. If the rank of interactions is limited, then Applicants randomly sample sets of genes to knock out from a larger collection, and fill in the remaining values via nuclear norm regularized least-squares optimization (Candes, E. J. & Plan, Y. Matrix Completion With Noise. Proceedings of the IEEE. 98, 925-936, doi:10.1109/Jproc.2009.2035722 (2010)). Provable guarantees suggest that if the rank, r, is small relative to the number of genes, n, then m≥O(n^(6/5) r log n) sampled entries are sufficient. However, these guarantees assume rough uniformity in the loadings of interaction singular vectors, and this assumption is unlikely to hold if the interaction matrix is very sparse. In this case, Applicants perform the same random sampling, and simultaneously regularize over both the nuclear norm and the L1 norm of the matrix (Richard, E., Savalle, P. & Vayatis, N. Estimation of Simultaneously Sparse and Low Rank Matrices. arXiv. doi:arXiv:1206.6474).
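
To give a sense of the sampling bound quoted above, the expression n^(6/5) r log n can be evaluated and compared with the n×n entries of a full pairwise interaction matrix. This is only a sketch: the constant hidden by the O(·) notation is not given in the text and is assumed here to be 1, the logarithm is assumed to be natural, and the gene counts and rank are hypothetical.

```python
import math


def completion_sample_bound(n_genes: int, rank: int, constant: float = 1.0) -> float:
    """Evaluate constant * n^(6/5) * r * log(n); the constant is an assumption."""
    return constant * n_genes ** 1.2 * rank * math.log(n_genes)


def sampled_fraction(n_genes: int, rank: int, constant: float = 1.0) -> float:
    """Bound expressed as a fraction of the n x n entries of a pairwise matrix."""
    return completion_sample_bound(n_genes, rank, constant) / n_genes ** 2


# Hypothetical sizes, for illustration only.
for n in (1000, 5000):
    print(n, round(completion_sample_bound(n, 10)), round(sampled_fraction(n, 10), 3))
```

With these assumed values the required fraction of entries shrinks as the number of genes grows, which is the regime in which undersampling becomes attractive.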

Compressed Sensing

Here, instead of working with a tensor of interaction terms, Applicants work with a basis that spans all higher order interactions. Each single quantitative phenotype is a real-valued function f(g) on possible genotypes g (the 2^(m) possible allelic or knockout states), represented as binary strings of length m. Applicants analyze such Boolean functions using Fourier decomposition (O'Donnell, R. Analysis of Boolean functions. (Cambridge University Press, 2014))

$f(g) = \sum_{b \in \{0,1\}^{m}} \hat{f}_{b} (-1)^{b \cdot g}, \quad \hat{f}_{b} = \frac{1}{2^{m}} \sum_{g \in \{0,1\}^{m}} f(g) (-1)^{g \cdot b},$

where the functions $(-1)^{b \cdot g}$ form an orthogonal basis indexed by binary strings b, and each Fourier coefficient $\hat{f}_{b}$ precisely quantifies the effect of one possible multi-gene interaction. For example with m=2, $\hat{f}_{00}$ is the average phenotype; $\hat{f}_{10}$ is the effect of the first gene KO, marginalized over the genetic background of the second; similarly for $\hat{f}_{01}$; and $\hat{f}_{11}$ quantifies the two-way interaction (the extent to which the double KO phenotype differs from that predicted by the sum of the effects of the single KOs). Applicants hypothesize that such genotype-phenotype maps are approximately sparse in the Fourier basis, such that there is a small number, s, of nonzero Fourier coefficients (not known a priori). With perturbations generated only up to a limited order, Applicants obtain a truncated Fourier model, which is a general linear model: the genetic interactions are in the basis functions (encoded into a design matrix), and the response is linear in the unknown Fourier coefficients. Applicants assume most truncated coefficients are negligible. Assuming that the genotype-phenotype maps are approximately sparse in the Fourier basis, Applicants use L1-penalized regression to learn the coefficients of the map from paired genotype-phenotype observations g_(i), f(g_(i)) (with uncertainty or noise in both).
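
The m=2 example can be worked through numerically. The sketch below implements the two sums of the Fourier decomposition directly; the phenotype values are hypothetical, genotypes are written as tuples in which 1 denotes a knocked-out gene, and the helper names are illustrative only.

```python
from itertools import product


def sign(b, g):
    """Return (-1)**(b . g) for binary tuples b and g."""
    return -1 if sum(x * y for x, y in zip(b, g)) % 2 else 1


def fourier_coefficients(f, m):
    """Compute every Fourier coefficient of f, a dict from genotype tuple to phenotype."""
    genotypes = list(product((0, 1), repeat=m))
    return {b: sum(f[g] * sign(b, g) for g in genotypes) / 2 ** m for b in genotypes}


def reconstruct(f_hat, g):
    """Evaluate the inverse transform at genotype g."""
    return sum(coef * sign(b, g) for b, coef in f_hat.items())


# Hypothetical phenotype values for two genes (1 = knocked out).
phenotype = {(0, 0): 10.0, (1, 0): 6.0, (0, 1): 8.0, (1, 1): 2.0}
f_hat = fourier_coefficients(phenotype, 2)

# Double-KO phenotype expected if the two single-KO effects simply added.
additive_prediction = phenotype[(1, 0)] + phenotype[(0, 1)] - phenotype[(0, 0)]
deviation = phenotype[(1, 1)] - additive_prediction
print(f_hat, deviation)
```

In this example the coefficient indexed by 00 equals the mean of the four phenotypes, and the deviation of the double KO from the additive expectation equals four times the coefficient indexed by 11, consistent with that coefficient quantifying the two-way interaction.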

Compressed sensing posits that if Applicants' perturbations are de-coherent under the given basis, then exact recovery is possible with dramatic under-sampling (in the noiseless case) (Candes, E. Compressive sampling. Proceedings of the International Congress of Mathematicians Madrid, Aug. 22-30, 2006. 3, 19, doi:10.4171/022 (2006)), such that a sample size n=C s log p will suffice, where s is the number of effectively nonzero coefficients, p is the magnitude of combinatorial expansion and C depends on noise and experimental design (how the g_(i) are sampled) (Candes, E. Mathematics of sparsity (and few other things). ICM 2014 Proceedings, to appear. (2014)). By varying the penalization parameter, Applicants learn sparse structures at different levels of thresholding, and find the level below which the data become insufficient to capture the signal (Hastie, T., Friedman, J. & Tibshirani, R. The elements of statistical learning. Vol. 2 (Springer, 2009)). Applicants explore using a larger penalization parameter on the higher order interaction coefficients, and, with good estimates of single perturbations, even no penalty on the linear terms, or regressing those out first. If each experiment is a Poisson random sampling of KOs, Applicants expect the measurements to have good de-coherence under the Fourier basis, provided the mean number of KO experiments per gene is not too low. If Applicants' assumptions are correct, a soft phase transition in performance as the number of observations crosses a threshold should be observed. Applicants use a small complete dataset or downsampling of a larger more random dataset, to assess if the appropriate transition is observed.
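
The effect of varying the penalization parameter can be illustrated on a small complete dataset. The sketch below covers only a special case and is not a general solver: when all 2^m genotypes are observed once, the Fourier design is orthogonal, and the L1-penalized least-squares solution (with the squared error scaled by 1/(2·2^m), and with every coefficient penalized equally) reduces to soft-thresholding each Fourier coefficient. The coefficient values are hypothetical.

```python
from itertools import product


def sign(b, g):
    """Return (-1)**(b . g) for binary tuples b and g."""
    return -1 if sum(x * y for x, y in zip(b, g)) % 2 else 1


def fourier_coefficients(f, m):
    """Compute every Fourier coefficient of f, a dict from genotype tuple to phenotype."""
    genotypes = list(product((0, 1), repeat=m))
    return {b: sum(f[g] * sign(b, g) for g in genotypes) / 2 ** m for b in genotypes}


def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values within [-lam, lam] become zero."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_complete_design(f, m, lam):
    """L1-penalized fit for the complete 2^m design (soft-thresholded coefficients)."""
    return {b: soft_threshold(c, lam) for b, c in fourier_coefficients(f, m).items()}


M = 3
GENOTYPES = list(product((0, 1), repeat=M))
# Hypothetical coefficients: three large terms plus two small ones standing in for noise.
true_coef = {(0, 0, 0): 5.0, (1, 0, 0): 2.0, (0, 1, 1): -1.0, (1, 1, 0): 0.1, (0, 0, 1): -0.05}
phenotype = {g: sum(c * sign(b, g) for b, c in true_coef.items()) for g in GENOTYPES}


def support(lam):
    """Index strings of the coefficients that remain nonzero at penalty lam."""
    fit = lasso_complete_design(phenotype, M, lam)
    return sorted(b for b, c in fit.items() if c != 0.0)


for lam in (0.01, 0.5, 1.5, 3.0):
    print(lam, support(lam))
```

As the penalty increases, the small terms drop out first and the dominant terms survive longest, which is the thresholding behavior described above. For undersampled or noisy designs a general L1 solver would be needed instead of this closed form.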

Kernel Learning.

If there is no strict sparsity in the rank or in the coefficients, Applicants build predictive functions of the effects of combinatorial perturbations, using a kernel of experimental similarity. Given m experiments, Applicants define an m×m polynomial kernel, for example, based on the overlap in knockouts between any pair of experiments. Applicants learn a weighted combination of kernel vectors that fits a collection of training data, and use the coefficients to predict the outcome of new experiments. Here, the density of nonlinear interaction terms can be much greater, since Applicants do not directly learn any particular interaction coefficient, but rather a kernelized version of the entire polynomial. Indeed, if the interaction terms are too sparse, kernel learning is unlikely to be successful with under-sampling.
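
One possible realization of such a kernel predictor is kernel ridge regression with a polynomial kernel on knockout overlap. The sketch below is an assumption-laden illustration rather than the method of the invention: the kernel form (1 + overlap)^2, the small ridge term, the gene labels and the phenotype function are all hypothetical, and a plain Gaussian elimination stands in for a numerical library.

```python
from itertools import combinations


def overlap_kernel(a, b, degree=2):
    """Polynomial kernel on the number of knockouts shared by two experiments."""
    return (1 + len(a & b)) ** degree


def solve(matrix, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit(experiments, outcomes, ridge=1e-9):
    """Learn one weight per training experiment (kernel ridge regression)."""
    gram = [[float(overlap_kernel(a, b)) for b in experiments] for a in experiments]
    for i in range(len(gram)):
        gram[i][i] += ridge
    return solve(gram, outcomes)


def predict(experiments, weights, new_experiment):
    """Weighted combination of kernel values between training and new experiment."""
    return sum(w * overlap_kernel(e, new_experiment) for e, w in zip(experiments, weights))


def phenotype(ko):
    """Hypothetical phenotype: additive KO effects plus one pairwise interaction."""
    value = 10.0 - 2.0 * ("A" in ko) - 3.0 * ("B" in ko) - 1.0 * ("C" in ko)
    return value + 1.5 * ({"A", "B"} <= ko)


genes = ("A", "B", "C")
# Train on the unperturbed state and all single and double knockouts.
train = [frozenset(c) for k in range(3) for c in combinations(genes, k)]
weights = fit(train, [phenotype(e) for e in train])
triple_prediction = predict(train, weights, frozenset(genes))
print(triple_prediction, phenotype(frozenset(genes)))
```

In this toy setting the unseen triple knockout is predicted from lower-order experiments without estimating any individual interaction coefficient, because the hypothetical phenotype contains no three-way term and therefore lies within the span of the degree-2 kernel.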

Applicants analyzed 3-way interaction data measured by overexpression of every 3-way combination of 39 miRNAs and a phenotype of drug resistance, and confirmed substantial sparsity in the data. Applicants analyzed the 5-way interactions affecting expression profiles in response to salt in yeast between the MAPK Hog1 (p38 ortholog) and 4 TFs (1, 2, 3, 4, and 5 KO: 32 perturbations). Using a (non-regularized) linear model, Applicants quantified 1- and 2-way interactions, finding diverse non-linearities.

Analyzing a Cell Population at the Single Cell Level

The method according to the invention may comprise a step for single-cell molecular profiling. In some embodiments the step may comprise processing said cell population in order to physically separate cells. In some embodiments the step may comprise single-cell manipulation, e.g. using microfluidics based techniques. In some embodiments the step may comprise reverse emulsion droplet-based single-cell analysis or hydrogel droplet-based single-cell analysis.

The method of the invention may use microfluidics, e.g. to culture cells in specific combinations, control the spatiotemporal signals they receive, and/or trace and sample them as desired.

Molecular Profiling at the Single Cell Level

The method according to the invention may comprise a step for single-cell molecular profiling. This step may involve analyzing biomolecules quantitatively or semi-quantitatively. The biomolecules may include RNA, mRNA, pre-mRNA, proteins, peptides, chromatin or DNA. Said analysis may be performed genome-wide. Said analysis may be coupled (dual or sequential analysis of two or more types of biomolecules).

In some embodiments the step may comprise single-cell genomic profiling, single-cell RNA profiling, single-cell DNA profiling, single-cell epigenomic profiling, single-cell protein profiling, or single-cell reporter gene expression profiling. Proteins that may be used to alter genomic and epigenomic state are discussed in Shmakov et al., 2015, Molecular Cell 60, 1-13 and Zetsche et al., 2015, Cell 163, 759-771.

In some embodiments the step may comprise single-cell RNA abundance analysis, single-cell transcriptome analysis, single-cell exome analysis, single-cell transcription rate analysis, or single-cell RNA degradation rate analysis.

In some embodiments the step may comprise single-cell DNA abundance analysis, single-cell DNA methylation profiling, single-cell chromatin profiling, single-cell chromatin accessibility profiling, single-cell histone modification profiling, or single-cell chromatin indexing.

In some embodiments the step may comprise single-cell protein abundance analysis, single-cell post-translational protein modification analysis, or single-cell proteome analysis.

In some embodiments the step may comprise single-cell mRNA reporter analysis, detection or quantification.

In some embodiments the step may comprise single-cell dual molecular profiling, such as any combination of two amongst single-cell RNA profiling, single-cell DNA profiling, single-cell protein profiling, and mRNA reporter analysis.

The method of the invention may include a step of determining single cell RNA levels. For single cell RNA-Seq (scRNA-Seq), one may use Drop-Seq (Macosko, E. Z., Basu, A., Satija, R., Nemesh, J., Goldman, M., Tirosh, I., Bialas, A. R., Kamitaki, N., Sanes, J. R., Weitz, D. A., Shalek, A. K., Regev, A. & McCarroll, S. A. Highly Parallel Genome-wide Expression Profiling of Individual Cells Using Nanoliter Droplets. Cell. 2015 May 21; 161(5):1202-14. doi: 10.1016/j.cell.2015.05.002. PMCID:4481139) and variants thereof. This technique relies on reverse-emulsion, early barcoding for analyzing 10⁴-10⁶ cells/experiment at very low cost. Drop-Seq enables co-encapsulation of individual cells with uniquely barcoded mRNA capture beads in reverse emulsion droplets. After lysis and mRNA capture, the emulsion is broken and all beads/cells are processed (RT, library prep) together, with each cell's profile then deconvolved from the bead barcodes. In some embodiments, droplets can compartmentalize hundreds of cells/sec, are stable over time and to heat, and can serve as micro-vessels to add reagents; after RT, barcoded beads are stable and can be sorted or subselected. Sampling noise from shallow read depth is substantially lower than the technical variability between cells (Shalek, A. K., Satija, R., Shuga, J., Trombetta, J. J., Gennert, D., Lu, D., Chen, P., Gertner, R. S., Gaublomme, J. T., Yosef, N., Schwartz, S., Fowler, B., Weaver, S., Wang, J., Wang, X., Ding, R., Raychowdhury, R., Friedman, N., Hacohen, N., Park, H., May, A. P. & Regev, A. Single-cell RNA-seq reveals dynamic paracrine control of cellular variation. Nature. 510, 363-369, doi:10.1038/nature13437 (2014). PMCID:4193940.), so one may sufficiently estimate expression with ˜100,000 reads per cell for many applications (especially with a 5′ or 3′-end protocol, Satija, R., Farrell, J. A., Gennert, D., Schier, A. F. & Regev, A. Spatial reconstruction of single-cell gene expression data. Nature biotechnology. 33, 495-502, doi:10.1038/nbt.3192 (2015)).
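
The deconvolution of pooled reads into per-cell profiles can be sketched as a grouping operation on bead barcodes. The example below is a simplified illustration with hypothetical barcodes and gene names; it assumes each read has already been reduced to a (bead barcode, gene) pair and omits steps such as read alignment and barcode error handling.

```python
from collections import Counter, defaultdict


def deconvolve(reads):
    """Group (bead_barcode, gene) reads into one gene-count profile per bead barcode."""
    profiles = defaultdict(Counter)
    for barcode, gene in reads:
        profiles[barcode][gene] += 1
    return {barcode: dict(counts) for barcode, counts in profiles.items()}


# Hypothetical pooled reads obtained after the emulsion is broken.
reads = [
    ("AACG", "Actb"),
    ("TTGC", "Gapdh"),
    ("AACG", "Actb"),
    ("AACG", "Cd19"),
    ("TTGC", "Actb"),
]
profiles = deconvolve(reads)
print(profiles)
```

Because the bead barcode is attached before pooling, all downstream processing can be performed on the mixture and the per-cell profiles recovered afterwards.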

Single cell RNA may also be analyzed as discussed in Klein, A. M., Mazutis, L., Akartuna, I., Tallapragada, N., Veres, A., Li, V., Peshkin, L., Weitz, D. A., Kirschner, M. W. Droplet barcoding for single-cell transcriptomics applied to embryonic stem cells. Cell. 2015 May 21; 161(5):1187-201. doi: 10.1016/j.cell.2015.04.044. PMCID: 4441768.

The method of the invention may include determining RNA transcription and degradation rates. One may use RNA metabolically labeled with 4-thiouridine to measure RNA transcription and degradation rates (Rabani, M., Raychowdhury, R., Jovanovic, M., Rooney, M., Stumpo, D. J., Pauli, A., Hacohen, N., Schier, A. F., Blackshear, P. J., Friedman, N., Amit, I. & Regev, A. High-resolution sequencing and modeling identifies distinct dynamic RNA regulatory strategies. Cell. 159, 1698-1710, doi:10.1016/j.cell.2014.11.015 (2014). PMCID:4272607; Rabani, M., Levin, J. Z., Fan, L., Adiconis, X., Raychowdhury, R., Garber, M., Gnirke, A., Nusbaum, C., Hacohen, N., Friedman, N., Amit, I. & Regev, A. Metabolic labeling of RNA uncovers principles of RNA production and degradation dynamics in mammalian cells. Nature biotechnology. 29, 436-442, doi:10.1038/nbt.1861 (2011). PMCID:3114636).

The method of the invention may include a step of determining DNA methylation. One may apply methods for reduced representation bisulfite sequencing (RRBS), targeted capture, and whole genome bisulfite sequencing of DNA methylation from bulk to ultra-low inputs (Chan, M. M., Smith, Z. D., Egli, D., Regev, A. & Meissner, A. Mouse ooplasm confers context-specific reprogramming capacity. Nature genetics. 44, 978-980, doi:10.1038/ng.2382 (2012). PMCID:3432711; Smith, Z. D., Chan, M. M., Humm, K. C., Karnik, R., Mekhoubad, S., Regev, A., Eggan, K. & Meissner, A. DNA methylation dynamics of the human preimplantation embryo. Nature. 511, 611-615, doi:10.1038/nature13581 (2014). PMCID:4178976; Smith, Z. D., Chan, M. M., Mikkelsen, T. S., Gu, H., Gnirke, A., Regev, A. & Meissner, A. A unique regulatory phase of DNA methylation in the early mammalian embryo. Nature. 484, 339-344, doi:10.1038/nature10960 (2012). PMCID:3331945) to single cells.

The method of the invention may include a step of determining chromatin accessibility. This may be performed by ATAC-Seq. For massively parallel single cell ATAC-Seq one may implement a droplet-based assay. First, in-tube, one may use Tn5 transposase to fragment chromatin inside isolated intact nuclei and add universal primers at cutting sites. Next, in-drop, one may use a high diversity library of barcoded primers to uniquely tag all DNA that originated from the same single cell. Alternatively, one may perform all steps in drop. One may also use a strategy that relies on split pooled nuclei barcoding in plates (Cusanovich, D. A., Daza, R., Adey, A., Pliner, H., Christiansen, L., Gunderson, K. L., Steemers, F. J., Trapnell, C. & Shendure, J. Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing. Science. 2015 May 22; 348(6237):910-4. doi: 10.1126/science.aab1601. Epub 2015 May 7). Applicants have optimized key steps in a mixture of human and mouse cells, with specificity that exceeds the initial performance of mRNA Drop-Seq. Applicants have also used a Fluidigm C1 protocol (see https://www.fluidigm.com/products/cl-system) to analyze ˜100 single DCs, closely reproducing ensemble measures, high enrichment in TSSs, and nucleosome-like periodicity.

ATAC-seq (assay for transposase-accessible chromatin) identifies regions of open chromatin using a hyperactive prokaryotic Tn5-transposase, which preferentially inserts into accessible chromatin and tags the sites with sequencing adaptors [Pott and Lieb Genome Biology (2015) 16:172 DOI 10.1186/s13059-015-0737-7 and Buenrostro J D, Giresi P G, Zaba L C, Chang H Y, Greenleaf W J. Transposition of native chromatin for fast and sensitive epigenomic profiling of open chromatin, DNA-binding proteins and nucleosome position. Nat Methods. 2013; 10:1213-128]. Two very different approaches have been used to perform ATAC-seq on single cells: one relied on physical isolation of single cells [Buenrostro J D, Wu B, Litzenburger U M, Ruff D, Gonzales M L, Snyder M P, et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature. 2015; 523:486-90], and the other avoided single-cell reaction volumes by using a two-step combinatorial indexing strategy [Cusanovich D A, Daza R, Adey A, Pliner H A, Christiansen L, Gunderson K L, et al. Epigenetics. Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing. Science. 2015; 348:910-4].

In the indexing scheme, Cusanovich et al. [Cusanovich D A, Daza R, Adey A, Pliner H A, Christiansen L, Gunderson K L, et al. Epigenetics. Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing. Science. 2015; 348:910-4] lysed cells, and 2500 nuclei were placed into each well of a 96-well plate. Transposases loaded with unique adaptors were added to each well, creating 96 pools of approximately 2500 nuclei, each pool with distinct barcodes. Nuclei from all of the transposition reactions were mixed, and using a fluorescence-activated cell sorter (FACS) 15-25 nuclei were deposited into each well of a second 96-well plate. Nuclei in each well of this second plate were lysed, and the DNA was amplified using a primer containing a second barcode. The low number of nuclei per well ensured that about 90% of the resulting barcode combinations were unique to a single cell. This combinatorial indexing strategy enabled the recovery of 500-1500 cells with unique tags per experiment. Overall Cusanovich et al. obtained scATAC-seq data from over 15,000 individual cells from mixtures of GM12878 lymphoblastoid cells with HEK293, HL-60, or mouse Patski cells. The number of reads associated with any single cell was very low, varying from 500 to about 70,000 with a median of fewer than 3000 reads per cell.
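
The figure of about 90% unique barcode combinations can be rationalized with a simple occupancy calculation. The sketch below assumes, as a simplification not stated in the text, that each nucleus sorted into a second-round well carries one of the 96 first-round barcodes uniformly at random and independently of the others; the function name is illustrative.

```python
def unique_barcode_fraction(nuclei_per_well: int, n_first_round_barcodes: int = 96) -> float:
    """Expected share of observed barcode combinations that tag exactly one nucleus.

    For one first-round barcode within one second-round well, the number of nuclei
    carrying it is binomial; the share is P(exactly one) / P(at least one).
    """
    p = 1.0 / n_first_round_barcodes
    k = nuclei_per_well
    exactly_one = k * p * (1 - p) ** (k - 1)
    at_least_one = 1 - (1 - p) ** k
    return exactly_one / at_least_one


for k in (15, 25):
    print(k, round(unique_barcode_fraction(k), 3))
```

Under this uniform model the fraction falls as more nuclei are sorted per well, and for 15 to 25 nuclei per well it brackets the reported value of about 90%.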

Buenrostro et al. (Buenrostro J D, Wu B, Litzenburger U M, Ruff D, Gonzales M L, Snyder M P, et al. Single-cell chromatin accessibility reveals principles of regulatory variation. Nature. 2015; 523:486-90) used a programmable microfluidic device (C1, Fluidigm) to isolate single cells and perform ATAC-seq on them in nanoliter reaction chambers. Each nanochamber was analyzed under a microscope to ensure that a single viable cell had been captured. This approach is simple and has the significant advantage of a carefully monitored reaction environment for each individual cell, although the throughput was limited to processing 96 cells in parallel. Buenrostro et al. sampled 1632 cells from eight different cell lines, including GM12878, K562, and H1 cells, and obtained an average of 73,000 reads per cell, about 20 times the number of reads per cell obtained using the combinatorial barcoding strategy.

The method of the invention may include a step of determining histone modifications and protein-DNA interactions. One may apply tools that use genomic barcoding to index chromatin prior to immunoprecipitation to enable multiplexed analysis of limited samples and individual cells in a single reaction. For single-cell chromatin profiling, one may use Drop-ChIP where the chromatin of individual cells is barcoded in droplets. Based on the Drop-Seq technique, one may encapsulate single cells, lyse and MNase-digest chromatin, then fuse a second droplet with barcoded oligos, ligate them to the fragmented chromatin, break the emulsion, add carrier chromatin, and carry out ChIP-Seq. This may be performed using a protocol with split-pool barcoding to collect 10⁴-10⁵ single cells/assay.

ChIP-sequencing, also known as ChIP-seq, is a method used to analyze protein interactions with DNA which may be used with perturbation. ChIP-seq combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map global binding sites precisely for any protein of interest. ChIP-seq is used primarily to determine how transcription factors and other chromatin-associated proteins influence phenotype-affecting mechanisms. Determining how proteins interact with DNA to regulate gene expression is important for understanding many biological processes and disease states. This epigenetic information is complementary to genotype and expression analysis. ChIP-seq technology is an alternative to ChIP-chip, which requires a hybridization array. Specific DNA sites in direct physical interaction with transcription factors and other proteins can be isolated by chromatin immunoprecipitation. ChIP produces a library of target DNA sites bound to a protein of interest in vivo. Massively parallel sequence analyses are used in conjunction with whole-genome sequence databases to analyze the interaction pattern of any protein with DNA (see, e.g., Johnson D S, Mortazavi A et al. (2007) Genome-wide mapping of in vivo protein-DNA interactions. Science 316: 1497-1502), or the pattern of any epigenetic chromatin modifications. This can be applied to the set of ChIP-able proteins and modifications, such as transcription factors, polymerases and transcriptional machinery, structural proteins, protein modifications, and DNA modifications. See, e.g., “Whole Genome Chromatin IP Sequencing,” Illumina, Inc (2010), available at http://www.illumina.com/Documents/products/datasheets/datasheet_chip_sequence.pdf (Chromatin Immunoprecipitation with massively parallel sequencing).

For multiplex analysis of (limited) bulk samples, one may rely on chromatin indexing (MINT-ChIP; iChIP), where MNase-fragmented chromatin is indexed by ligation to a uniquely barcoded adaptor and then pooled and processed in multiplex through all subsequent phases, either with (MINT-ChIP) or without (iChIP: Lara-Astiaso, D., Weiner, A., Lorenzo-Vivas, E., Zaretsky, I., Jaitin, D. A., David, E., Keren-Shaul, H., Mildner, A., Winter, D., Jung, S., Friedman, N. & Amit, I. Immunogenetics. Chromatin state dynamics during blood formation. Science. 345, 943-949, doi:10.1126/science.1256271 (2014). PMCID:4412442) carrier chromatin (without adaptors).

The method of the invention may include a step of determining proteins. Recently developed assays (e.g., CyTOF: Bendall, S. C., Simonds, E. F., Qiu, P., Amir el, A. D., Krutzik, P. O., Finck, R., Bruggner, R. V., Melamed, R., Trejo, A., Ornatsky, O. I., Balderas, R. S., Plevritis, S. K., Sachs, K., Pe'er, D., Tanner, S. D. & Nolan, G. P. Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science. 332, 687-696, doi:10.1126/science.1198704 (2011). PMCID:3273988) allow multiplexed, single cell detection of dozens of proteins in millions of cells, but rely on antibodies and cannot yet be combined with DNA readout. Darmanis et al., Simultaneous Multiplexed Measurement of RNA and Proteins in Single Cells, Cell 14, (2016), uses PEA and RT-qPCR to detect proteins and mRNA species from the same single cell using split lysates (76 proteins, 96 mRNA) (see also Genshaft et al. Multiplexed Targeted Profiling of Single-Cell Proteomes and Transcriptomes in a Single Reaction, submitted to Genome Biology). Frei et al., Highly multiplexed simultaneous detection of RNAs and proteins in single cells, Nature Methods (2016), uses CyTOF to measure approximately 40 combined targets between mRNA and protein. Conversely, mass spectrometry (LC-MS/MS) allows quantitative analysis of entire proteomes, but deep analysis requires large amounts of protein/cells. To measure single cell protein levels and post-translational modifications (PTMs), one may use one of three complementary antibody-based assays: (1) standard flow cytometry (a few proteins/PTMs, >10⁶ single cells); (2) CyTOF (Bendall, S. C., Simonds, E. F., Qiu, P., Amir el, A. D., Krutzik, P. O., Finck, R., Bruggner, R. V., Melamed, R., Trejo, A., Ornatsky, O. I., Balderas, R. S., Plevritis, S. K., Sachs, K., Pe'er, D., Tanner, S. D. & Nolan, G. P. Single-cell mass cytometry of differential immune and drug responses across a human hematopoietic continuum. Science. 332, 687-696, doi:10.1126/science.1198704 (2011). PMCID:3273988) (heavy metal labeling with multiplex barcoding; ˜30-50 proteins/PTMs, 10⁵-10⁶ single cells); and (3) novel, highly multiplexed, DNA sequencing-based readouts of protein levels (100s proteins/PTMs; 10⁶ cells). For sequencing based readouts, one may use one of two approaches, geared at detecting hundreds of proteins in single cells: Immuno-Seq (when antibodies can be washed out: Niemeyer, C. M., Adler, M. & Wacker, R. Detecting antigens by quantitative immuno-PCR. Nat Protoc. 2, 1918-1930, doi:10.1038/nprot.2007.267 (2007)) and proximity extension assays (PEA, when antibodies cannot be washed away: Hammond, M., Nong, R. Y., Ericsson, O., Pardali, K. & Landegren, U. Profiling cellular protein complexes by proximity ligation with dual tag microarray readout. PLoS One. 7, e40405, doi:10.1371/journal.pone.0040405 (2012). PMCID:3393744; Nong, R. Y., Wu, D., Yan, J., Hammond, M., Gu, G. J., Kamali-Moghaddam, M., Landegren, U. & Darmanis, S. Solid-phase proximity ligation assays for individual or parallel protein analyses with readout via real-time PCR or sequencing. Nat Protoc. 8, 1234-1248, doi:10.1038/nprot.2013.070 (2013); Stahlberg, A., Thomsen, C., Ruff, D. & Aman, P. Quantitative PCR analysis of DNA, RNAs, and proteins in the same single cell. Clin Chem. 58, 1682-1691, doi:10.1373/clinchem.2012.191445 (2012).) These use DNA-sequence based encoding, and are compatible with other genomic readouts (e.g., sgRNA barcodes). DNA-sequence tags can be conjugated to antibodies (Janssen, K. P., Knez, K., Spasic, D. & Lammertyn, J. Nucleic acids for ultra-sensitive protein detection. Sensors (Basel). 13, 1353-1384, doi:10.3390/s130101353 (2013). PMCID:3574740), nanobodies (Pardon, E., Laeremans, T., Triest, S., Rasmussen, S. G., Wohlkonig, A., Ruf, A., Muyldermans, S., Hol, W. G., Kobilka, B. K. & Steyaert, J. A general protocol for the generation of Nanobodies for structural biology. Nat Protoc. 9, 674-693, doi:10.1038/nprot.2014.039 (2014). PMCID:4297639; Theile, C. S., Witte, M. D., Blom, A. E., Kundrat, L., Ploegh, H. L. & Guimaraes, C. P. Site-specific N-terminal labeling of proteins using sortase-mediated reactions. Nat Protoc. 8, 1800-1807, doi:10.1038/nprot.2013.102 (2013). PMCID:3941705) or aptamers (Janssen, K. P., Knez, K., Spasic, D. & Lammertyn, J. Nucleic acids for ultra-sensitive protein detection. Sensors (Basel). 13, 1353-1384, doi:10.3390/s130101353 (2013). PMCID:3574740.).

Spatially Patterning Cells on Surfaces

In one embodiment, a biocompatible surface for patterning cells is prepared. This surface can be inert (e.g., a functionalized glass slide) or biological (e.g., cells). The cell functionalizing probe is then flowed over the surface and the region where cell type 1 is to be placed is photoactivated. Excess cell functionalizing probe is then washed away, and cell functionalizing barcoded tag (e.g., an oligo) is flowed over the surface, selectively attaching as outlined below in the region that was photoactivated.

In one embodiment, the cell functionalizing tag (e.g., oligo tag) is click enabled (e.g., azide modified) and will react with the click moiety on the probe (e.g., a strained alkene or strained alkyne).

In another embodiment, streptavidin is flowed over the surface, where it can bind the biotin on the cell functionalizing probe covalently attached to the surface. Biotin functionalized cell functionalizing barcoded tags (e.g., biotin functionalized oligos) are then flowed over the surface and bind to the streptavidin, which is attached to the cell functionalizing probe that is, in turn, covalently attached to the surface.

After the cell functionalizing barcoded tags (e.g., oligos) have been added for cell type 1, this process is repeated from the beginning to add another cell functionalizing probe and tag (e.g., oligo) specific for cell type 2, and so on until n cell type specific cell functionalizing barcoded tags (e.g., oligos) have been conjugated to the surface.

In parallel, different cell functionalizing barcoded tags (e.g., oligos) can be conjugated to cell types 1 through n. In one embodiment, the cell functionalizing barcoded tags (e.g., oligos) are the reverse complement of the cell functionalizing barcoded tags (oligos) conjugated to the surface, specific for cell types 1 through n. In this way, cell types 1-n are placed in the locations specified by oligos 1-n.
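
The pairing between surface oligos and cell oligos can be sketched in a few lines. The sequences, region names and cell type labels below are hypothetical and far shorter than practical tags; the sketch only illustrates that each cell type receives the reverse complement of the oligo patterned at its intended location.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Hypothetical surface oligos, one per photoactivated region.
surface_oligos = {"region_1": "AAGGCTTC", "region_2": "GGATCCAA"}
# Hypothetical assignment of cell types to regions.
cell_types = {"region_1": "cell type 1", "region_2": "cell type 2"}

# Oligo to conjugate to each cell type so that it hybridizes at its region.
cell_oligos = {cell_types[r]: reverse_complement(s) for r, s in surface_oligos.items()}


def placement(cell_oligo: str):
    """Regions whose surface oligo is the reverse complement of the given cell oligo."""
    return [r for r, s in surface_oligos.items() if reverse_complement(s) == cell_oligo]


print(cell_oligos)
print(placement(cell_oligos["cell type 1"]))
```

In practice the tag sequences would also need to be chosen so that non-matching pairs do not cross-hybridize; that design step is outside the scope of this sketch.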

In one embodiment, the cell functionalizing barcoded tags (e.g., oligos) are covalently attached to the cells via a click reaction. More specifically, an NHS-click reagent (DBCO, OND, etc.) is covalently attached to cells (NHS reacts with primary amines on cell surface proteins). Afterwards, each cell type is incubated with a cell functionalizing barcoded tag (e.g., an azide oligo) unique to that cell type (which is the reverse complement of the oligo tag patterned on the surface for the placement of that cell type).

In another embodiment, the cell functionalizing barcoded tags (e.g., oligos) are bound to the cell via a biotin-streptavidin-biotin linkage. More specifically, an NHS-biotin reagent is covalently attached to cells. After excess NHS-biotin has been washed away, cells are incubated with streptavidin. Afterwards, each cell type is incubated with its cell-type-specific oligonucleotide (which is the reverse complement of the oligonucleotide that was patterned onto the surface for each cell type's placement).

DNA conjugation to cells has also been accomplished by Staudinger ligation (Gartner and Bertozzi, Programmed assembly of 3-dimensional microtissues with defined cellular connectivity, 2009), hydrazone conjugation (Twite et al., Direct attachment of microbial organisms to material surfaces through sequence-specific DNA hybridization, 2012), incorporation of dialkyl-DNA into the cell membrane (Selden et al., Chemically Programmed Cell Adhesion with Membrane-Anchored Oligonucleotide, 2012), and fatty-acid-conjugated duplex DNA (Weber et al., Efficient Targeting of Fatty-Acid Modified Oligonucleotides to Live Cell Membranes through Stepwise Assembly, 2014). In an embodiment (the approach that appears to be preferred in the field; fatty-acid-conjugated duplex DNA), two single-stranded DNAs conjugated to lipids are added to the cells sequentially. The first oligonucleotide added is long and contains a region that is the reverse complement of the DNA patterned on the surface. It is mixed with the cells and inserted into their plasma membrane. The second, shorter oligonucleotide is then added, and it is complementary to the region of the first oligonucleotide that is proximal to the lipid.
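
The sequence relationship between the two lipid-conjugated strands can be written out as a short Python sketch. This is a sketch under stated assumptions rather than a design procedure from the cited work: the function name design_lipid_duplex is hypothetical, sequences are written 5' to 3', the lipid is assumed to sit at the 5' end of the first strand (so the lipid-proximal region comes first), and the example sequences are arbitrary.

```python
from typing import Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_lipid_duplex(surface_oligo: str, proximal_region: str) -> Tuple[str, str]:
    """Return (first_oligo, second_oligo), both written 5'->3'.

    first_oligo: long strand = lipid-proximal region followed by a region
        that is the reverse complement of the surface-patterned DNA.
    second_oligo: short strand complementary to the lipid-proximal region.
    Assumption of this sketch: the lipid is on the 5' end of first_oligo.
    """
    first_oligo = proximal_region.upper() + reverse_complement(surface_oligo)
    second_oligo = reverse_complement(proximal_region)
    return first_oligo, second_oligo


# Hypothetical sequences, chosen only to illustrate the relationships.
first_oligo, second_oligo = design_lipid_duplex("ACGTACGTAC", "GGATTC")
```

In this arrangement the short strand pairs only with the membrane-proximal stretch of the long strand, leaving the surface-binding region single-stranded and available for hybridization.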

After cells have been conjugated to oligonucleotides, the cells are added to the surface and are spatially patterned according to their cell surface oligonucleotides and their hybridization to the oligonucleotides on the surface.

Subsequent rounds of addition can be used to build complex 3-dimensional architectures as well as 2-dimensional ones.

EXAMPLES

Using the cell functionalizing probe and cell functionalizing barcoded tag described herein with fine dissection tools, Applicants studied the regionality within the “macro-environment,” i.e., ~10⁴ cells. Applicants dissected a large (~2.5 mm³) MC38 tumor from a mouse model into 3 isolates based on location: section 1 (peripheral margin, close to body), section 2 (core), and section 3 (intermediate zone radially, skin side). Applicants FACS sorted T cells, macrophages, and tumor cells and completed single-cell RNA-Sequencing. High resolution structural information was retained through tissue dissociation and subsequent single-cell RNA-Sequencing, as illustrated in FIG. 13B using a projection onto the top principal components that shows regional differences between the sections.

Applicants further studied necropsy tissue from a non-human primate with single cell RNA-Sequencing of complete tissue composition across ~10 tissues. Applicants revealed that tissues with similar functionalities and anatomic designations (e.g., lymph nodes, or different regions along the small intestine) have distinct cellular compositions, presumably relevant for local biology and specialized function. Distinct tissues were dissociated from necropsy and single-cell RNA-Sequencing was completed on all viable cells. As illustrated in FIG. 15, even functionally and anatomically similar tissues, such as the Iliac lymph node and the Submandibular lymph node, are composed of differing frequencies of cell types and thus show differing projections into tSNE space. Applicants also observed that 3 secondary lymphoid tissues exhibit large variation in the frequency of T cells (as determined by CD3 delta chain).
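
The comparison of cell type frequencies across tissues can be illustrated with a minimal Python sketch. The function name cell_type_frequencies and the per-cell labels below are hypothetical; the toy labels are for illustration only and are not data from the study.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def cell_type_frequencies(
    cells: Iterable[Tuple[str, str]],
) -> Dict[str, Dict[str, float]]:
    """Compute, per tissue, the fraction of cells assigned to each cell type.

    cells: one (tissue, cell_type) pair per sequenced cell.
    """
    counts: Dict[str, Counter] = defaultdict(Counter)
    for tissue, cell_type in cells:
        counts[tissue][cell_type] += 1
    return {
        tissue: {ct: n / sum(c.values()) for ct, n in c.items()}
        for tissue, c in counts.items()
    }


# Toy labels for illustration only; these are not measurements.
toy_cells = (
    [("iliac_lymph_node", "T cell")] * 3
    + [("iliac_lymph_node", "B cell")] * 1
    + [("submandibular_lymph_node", "T cell")] * 1
    + [("submandibular_lymph_node", "B cell")] * 3
)
frequencies = cell_type_frequencies(toy_cells)
```

Tabulating frequencies this way makes differences in composition between otherwise similar tissues directly comparable, independent of how many cells were recovered from each tissue.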

Applicants demonstrated the principle that unique cellular phenotypes emerge based on tissue compartment (and therefore, local regional effects). From the same donor, Applicants collected the blood and sputum and completed single cell RNA-Sequencing on the isolated cells. FIG. 16 illustrates that the dominant factor in cell-cell variability is the cell type (e.g., neutrophils cluster with neutrophils, and lymphocytes cluster separately). In addition, the principal components analysis plot shows that the tissue compartments are a major source of variability between cells of the “same type” (here: activated neutrophils). Applicants observed that the phenotypes of cell types that are canonically considered “equivalent” are in fact largely defined by the microenvironment of origin. Moreover, Applicants observed that an entire subset of cell phenotype (immature neutrophils) exists in the blood of human donors but is not present in the sputum.

A mouse tumor model (MC38, colon carcinoma) was used as an archetypal tissue where intratumoral regionality could be observed and leveraged to better understand modes of cell-cell interaction (microenvironment effects). Applicants observed that cell behaviors are altered by their local neighborhood. Different cell types in the tumor tissue (T cells, macrophages, and tumor cells) were FACS sorted. T cells, sorted as previously described, from multiple regions in a single tumor were analyzed for regionally distinct phenotypes. This analysis is motivated by the assumption that regionally distinct phenotypes may represent regional alterations in the interacting cells. Applicants looked at exhaustion, a T cell phenotype largely defined by the interaction with exhaustion-inducing tumor cells and potentially other cell mediators and secreted molecules. Applicants observed that certain regions of the tumor contain cells with exhaustion phenotypes that are similar to each other (within a microenvironment), yet are distinct from other cells in distant regions (FIG. 20). Moreover, the addition of spatial metadata to traditional single-cell RNA-Sequencing datasets necessitates new formalisms to analyze cell-cell interaction and rationality.
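
One simple way to look for regionally distinct phenotypes of this kind is to summarize a signature score per region. The Python sketch below is an assumption-laden illustration, not the analysis used by Applicants: the score is taken to be the mean expression of a marker gene list, the marker genes shown (Pdcd1, Havcr2, Lag3) are commonly cited exhaustion markers chosen here for illustration, and the expression values are toy numbers rather than measurements.

```python
from statistics import mean, pvariance
from typing import Dict, List, Sequence, Tuple

Expression = Dict[str, float]  # gene name -> expression value for one cell


def exhaustion_score(expression: Expression, marker_genes: Sequence[str]) -> float:
    """Mean expression over the marker genes (missing genes count as 0)."""
    return mean(expression.get(gene, 0.0) for gene in marker_genes)


def region_summary(
    cells: List[Tuple[str, Expression]], marker_genes: Sequence[str]
) -> Dict[str, Tuple[float, float]]:
    """Return region -> (mean score, population variance of scores)."""
    scores: Dict[str, List[float]] = {}
    for region, expression in cells:
        scores.setdefault(region, []).append(
            exhaustion_score(expression, marker_genes)
        )
    return {region: (mean(s), pvariance(s)) for region, s in scores.items()}


# Assumed marker list and toy values, for illustration only.
markers = ["Pdcd1", "Havcr2", "Lag3"]
toy_cells = [
    ("core", {"Pdcd1": 3.0, "Havcr2": 3.0, "Lag3": 3.0}),
    ("core", {"Pdcd1": 3.0, "Havcr2": 3.0, "Lag3": 3.0}),
    ("margin", {"Pdcd1": 1.0, "Havcr2": 1.0, "Lag3": 1.0}),
    ("margin", {"Pdcd1": 1.0, "Havcr2": 1.0, "Lag3": 1.0}),
]
summary = region_summary(toy_cells, markers)
```

A small within-region variance together with a large difference between region means would be consistent with cells that resemble their neighbors but differ from cells in distant regions.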

Further, Applicants observed that specific cellular phenotypes are altered between different sections in a tumor structure, indicating that the microenvironment, interacting cell types, and the spatial relationships between each of these units play a critical role in emergent tissue behavior and phenotype. As illustrated in FIG. 21A, cells in the center of a tumor structure exhibit a strong signature for hypoxia. FIG. 21B demonstrates that the T cells that segregate between different tumor regions express different interferon signaling pathway components, and at different magnitudes. This indicates that immunity is regionally confined and reacts differently depending on the local influences.

Having thus described in detail preferred embodiments of the present invention, it is to be understood that the invention defined by the above paragraphs is not to be limited to particular details set forth in the above description as many apparent variations thereof are possible without departing from the spirit or scope of the present invention.

1.-12. (canceled)
13. A cell functionalizing barcoded tag comprising: (a) a polyadorned molecule, wherein the molecule comprises 2 to 5 substituents comprising: (i) a click-enabled moiety, wherein the click-enabled moiety comprises an azide, tetrazine, tetrazole, or nitrone; and, (ii) an oligonucleotide barcode comprising a spatial barcode.
14. The cell functionalizing barcoded tag of claim 13, wherein the polyadorned molecule comprises a label.
15. The cell functionalizing barcoded tag of claim 14, wherein the label comprises a fluorophore, a peptide-based tag, biotin, an oligonucleotide, a hapten, affinity reagent, lanthanide heavy metal(s) or combination thereof, or a cyanine-based dye.
16. The cell functionalizing barcoded tag of claim 15, wherein the peptide-based tag comprises FLAG-tag, V5 tag, HA-tag, AviTag, Calmodulin-tag, polyglutamate tag, E-tag, His-tag, Myc-tag, S-tag, SBP-tag, Softag 1, Softag 3, Strep-tag, TC tag, VSV-tag, or Xpress tag.
17.-34. (canceled)